THE BELL SYSTEM
TECHNICAL JOURNAL

DEVOTED TO THE SCIENTIFIC AND ENGINEERING
ASPECTS OF ELECTRICAL COMMUNICATION

VOLUME XLV    NOVEMBER 1966    NUMBER 9





COPYRIGHT © 1966 AMERICAN TELEPHONE AND TELEGRAPH COMPANY

Programming and Control Problems Arising from Optimal Routing
    in Telephone Networks    V. E. BENES    1373
Random Tropospheric Angle Errors in Microwave Observations
    of the Early Bird Satellite    J. H. W. UNGER    1439
On the Sensitivity of Channel Capacity for the Gaussian
    Band-limited Channel    I. W. SANDBERG    1475
Phase Vocoder    J. L. FLANAGAN AND R. M. GOLDEN    1493
Theory of Error Rates for Digital FM    J. E. MAZO AND J. SALZ    1511
Noise in an FM System Due to an Imperfect Linear Transducer
    M. L. LIOU    1537
Bounds for Certain Multiprocessing Anomalies    R. L. GRAHAM    1563
Phase and Amplitude Measurements of Coherent Optical
    Wavefronts    J. T. RUSCIO    1583
State of the Art in GaP Electroluminescent Junctions
    M. GERSHENZON    1599
Schottky Barrier Photodiodes with Antireflection Coating
    M. V. SCHNEIDER    1611
Topology of Thin Film RC Circuits    F. W. SINDEN    1639

Contributors to This Issue    1663
B.S.T.J. Briefs: Realizability Conditions for the Impedance
    Function of the Lossless Tapered Transmission Line
    P. L. ZADOR    1667


THE BELL SYSTEM TECHNICAL JOURNAL 


ADVISORY BOARD 
P. A. GORMAN, President, Western Electric Company 


J. B. Fisk, President, Bell Telephone Laboratories 


B. S. GILMER, Executive Vice President, 
American Telephone and Telegraph Company 


EDITORIAL COMMITTEE 


W. E. DANIELSON, Chairman 

F. T. ANDREWS, JR.        E. D. REED
E. E. DAVID               M. TANENBAUM
C. W. HOOVER, JR.         S. H. WASHBURN
D. H. LOONEY              Q. W. WIEST
E. C. READ                C. R. WILLIAMSON


EDITORIAL STAFF 


G. E. SCHINDLER, JR., Editor
A. HOWARD, JR., Assistant Editor
. M. PURVIANCE, Production and Illustrations
. J. SCHWETJE, Circulation


THE BELL SYSTEM TECHNICAL JOURNAL is published ten times a year 
by the American Telephone and Telegraph Company, H. I. Romnes, President, 
C. E. Wampler, Vice President and Secretary, J. J. Scanlon, Vice President 
and Treasurer. Checks for subscriptions should be made payable to American 
Telephone and Telegraph Company and should be addressed to the Treasury De- 
partment, Room 2312C, 195 Broadway, New York, N. Y. 10007. Subscriptions 
$5.00 per year; single copies $1.25 each. Foreign postage $1.08 per year; 18 cents 
per copy. Printed in U.S.A. 


THE BELL SYSTEM 
TECHNICAL JOURNAL 


VOLUME XLV    NOVEMBER 1966    NUMBER 9


Copyright © 1966, American Telephone and Telegraph Company 
In many circumstances a telephone call can be completed through a
connecting network in several ways. Hence, there naturally arise problems of
optimal routing, that is, of making the choices of routes so as to achieve
extrema of one or more measures of system performance, such as the loss
(probability of blocking) or the carried load.

As is customary in traffic theory, a Markov process is used to describe
network operation with complete information. The controlled system is de- 
scribed by linear differential equations with the control functions (expressing 
the routing method being used) among the coefficients. Restricting attention 
to asymptotic behavior leads to a problem of maximizing a bilinear form 
subject to a linear equality constraint whose matrix is itself constrained to 
lie in a given convex set. An alternative approach first shows that minimiz- 
ing the loss, and maximizing the fraction of events that are successful at- 
tempts to place a call, are equivalent. This fact permits a dynamic program- 
ming formulation, which, in turn, leads to a very large linear programming 
problem. Two small examples are treated numerically by this method. 

It is particularly important to try to verbalize, and then mechanize, the
optimal routing strategies. In this endeavor, the linear programming formu- 
lation is of limited usefulness. Therefore, in the latter half of the work we 
have attempted to use the special combinatorial structure imposed by the
telephonic origins of the problem to shed light on the character of the optimal 
strategies. In particular, we show that for connecting networks with suitable 
combinatorial properties, the optimal route choices can be very simply de- 
scribed. Some of the results obtained were suggested by, and verify, conjec- 
tures from the practical lore of telephone routing. 

The problem of routing calls falls into two parts: Which attempted calls 
should be accepted in which states? What route should an accepted call use? 
The first problem is very hard, and only sample numerical answers for small
networks are obtained. We solve the second problem analytically for a large 
class of cases by appeal to combinatorial structure in the network. These 
cases can be described roughly as those in which the relative merit of states 
(as far as blocking is concerned) is consistent or continuous; i.e., if a state
x is “better” than another y, then the neighbors of x are in the same sense
“better” than the corresponding neighbors of y. An abundance of examples 
indicates that these cases are numerous and so warrant attention. In a net- 
work with this kind of combinatorial property, a policy which rejects no 
unblocked calls and minimizes the number of additional calls that are blocked 
by completing an attempted call differs from an optimal policy only in that 
the latter may reject some calls. 


I. INTRODUCTION 


A telephone connecting network invariably provides many paths on 
which a particular telephone call can be completed. One of the operational 
problems faced by the control unit of a telephone system is then to as- 
sign to each accepted and completable call a path and, in particular, to 
choose these assigned paths in the best way. This is the problem of opti- 
mal routing of telephone calls. Thus, in the theory of telephone traffic 
there naturally arise mathematical problems of optimal routing, that is, 
of making choices of routes in probabilistic models for operating net- 
works so as to achieve extrema of well-defined measures of system per- 
formance, such as the probability of blocking (loss). 

Unfortunately, it is not unfair to state that the voluminous probabilis- 
tic theory of telephone traffic, now some sixty years old, still has rather 
little to say about how routes for calls should be chosen. We are speaking 
here of the mathematical theory of traffic. Naturally, a wealth of useful 
information about routing has accumulated over the years from experi- 
ence in the telephone field; recently it has been buttressed and extended 
by many simulation studies. This information, nevertheless, still lies 
largely outside the province of the existing theory of telephone traffic. 

It is the aim of this work to formulate, study, and (in part) solve a 
general class of optimal routing problems for telephone networks. The
formulation of these problems is undertaken insofar as possible within 
the classical dynamical theory of telephone traffic initiated by A. K. 
Erlang, that is, in terms of Markov processes based on the assumptions 
of (i) negative exponential distributions for mutually independent holding-times,
and (ii) randomly originating traffic. To these assumptions is
added a description of how attempted calls are accepted and assigned 
routes. 

We conclude this introduction with a brief summary of the entire 
paper. A complete summary appears later (Section IX) after concepts
for formulating the problem have been discussed. As is customary in 
telephone traffic theory, we use a Markov process to describe the opera- 
tion of the connecting network under study. The Kolmogorov equations 
for this process then constitute a set of linear differential equations de- 
scribing the controlled system; in these the control functions expressing 
the routing method being used appear among the coefficients. It is nat- 
ural to restrict attention to asymptotic behavior; this leads to a problem 
of maximizing a bilinear (or linear fractional) form subject to linear 
constraints; this problem is equivalent to a linear programming problem. 
An alternative approach first shows that minimizing the probability of 
loss, and maximizing the fraction of events that are successful call at- 
tempts, are equivalent. This fact permits a classical dynamic program- 
ming approach. The remainder of the paper attempts to use this ap- 
proach to establish relations between combinatorial properties of the 
network and the policy(ies) optimal for given criteria of performance. 
In particular, it is shown that for connecting networks having certain 
“monotone” properties, optimal policies for minimizing loss correspond
closely to the heuristic advice, “Prefer those states in which as few calls
are blocked as possible.”


II. INFORMATION FOR ROUTING DECISIONS 


The problem of choosing “good” routes for information flow in a communications
network is vastly complicated by the difficult questions
surrounding the collection, updating, and relevance of information 
(about the state of the system) on the basis of which routing decisions 
are to be made. Thus, one of the items to be chosen in designing a rout- 
ing scheme is the information on which the routing is to be based. In- 
deed there is a whole spectrum of possible choices for this information, 
from no information at all (except what is unwittingly discovered in 
making call attempts), to full knowledge of the state of the connecting 
network. Clearly, a practical compromise between total ignorance and a 
very expensive, complex scheme based on many data must usually be
made. 

Our considerations in this work will be limited to the case of perfect 
information, in which the microscopic state of the connecting network is 
assumed known and available for making routing decisions. This case is, 
of course, very far from realistic: few existing or envisaged systems utilize 
even a small fraction of this possible information for routing. Indeed, 
much of it is likely to be of very little relevance. Nevertheless, it is im- 
portant to know what would be good routing if we could implement it 
and could afford it, so the full information case to be considered here 
forms at worst a limiting situation for which some theory is available, 
and a natural starting point for investigation. 


III. ACCEPTANCE OR REJECTION OF UNBLOCKED CALLS 


In the present discussion of the involved problem of routing calls, one 
of the difficulties that arises deserves special mention. This difficulty is 
the problem of deciding whether to accept or reject attempted calls 
which are not blocked. 

At first sight, it might seem that no unblocked call attempt should 
ever be rejected. The natural argument for this view is that the whole 
point of a telephone system is to complete calls, and that by rejecting 
an attempt that could have been completed, the system only lowers its 
performance. Sensible as this argument sounds, it is unacceptable be- 
cause it turns out that whether rejection of an unblocked call improves 
or lowers performance depends on the index of performance, on the dis- 
tribution of traffic among the sources, on the “community of interest” 
aspects of the system, etc. If the probability of blocking is used as an 
index, the “bad” effect of adding a particular call in a given state of the
system may be so great and so lasting that it is better to reject the call, 
and improve the chance of completing many later calls. 

To put the matter another way, the problem of routing with full information
seems at first to boil down to the question: “Which of the
paths available for call c in state x should be used?” This form of the 
problem overlooks the possibility that perhaps the best thing to do when 
the state is x and c is attempted is not to complete c at all, but to reject 
it! In other words, it assumes that, naturally, c will be put up in state 
x if it is attempted in x and is not blocked. This assumption has always 
been made in previous applications of the model we use (Refs. 1 and 2).

Conceivably, then, it is better to reject a call c that is not blocked in a
state x. Thus the problem of routing should be phrased: “Should a call 
c, free and not blocked in state x, be completed, and if so, by which 
route?”

It turns out that answering the first part of the question, as to which
calls should be completed in which states, is often the hardest part of the 
problem. Examples can be given in which it is fairly easy to solve the 
route selection part of the problem, but for which the question of whether 
a call should go in or not is not settled. That this question has substantial 
practical import is apparent from the simulation studies carried out by 
J. H. Weber, which clearly show how in trunking networks prohibition
of circuitous routes (and thus rejection of certain unblocked calls) can 
improve system performance. 

J. H. Weber has also remarked that the problem of deciding whether
an unblocked call should be refused is closely related to the distinction 
between trunking networks, used in toll systems to interconnect towns 
and cities, and central office networks, used to interconnect trunks and 
customers’ lines at a single location. An important combinatorial differ- 
ence between the two types of networks depends on whether all calls 
use the same number of links. This is usually the case in central office 
networks, but rarely true in trunk networks. One result suggested by 
this distinction would be that a call should always be put up when all 
calls use the same number of links, but that circuitous routes might be 
profitably disallowed otherwise. 

It appears then that network structure bears on the problem of what 
calls to accept. However, examples can be given which show that even 
when there is almost no network structure, other factors such as the distribution
of traffic and the “community of interest” can make rejection
of some calls part of an optimal policy. 

For example, if two lines calling at rates $\lambda_1$, $\lambda_2$, respectively, compete
for one trunk, the probability of blocking is

\[ \frac{2\lambda_1\lambda_2}{\lambda_1 + \lambda_2 + 2\lambda_1\lambda_2}, \]

if no unblocked call is rejected. If the calls of the line calling at rate
$\lambda_1$ are always rejected, the probability of blocking (with rejected
calls included among the blocked) is

\[ \frac{\lambda_1\lambda_2 + \lambda_1}{\lambda_1 + \lambda_2 + \lambda_1\lambda_2}. \]

(We have assumed that all calls have unit mean holding-time.) It follows
here that if

\[ \lambda_2^2 > \lambda_1 + \lambda_2 + \lambda_1\lambda_2 \]

then it is better to reject all $\lambda_1$ calls than to put them all in! This
example, although somewhat unrealistic, illustrates how the distribution of
traffic affects the rejection problem, even in the absence of network
structure.
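
The two blocking formulas of this example can be checked with exact arithmetic. The following is a minimal sketch, not taken from the paper: the function names are illustrative, and it assumes the three-state description implied by the text (trunk idle, held by line 1, or held by line 2, with unit hangup rate), so that the unnormalized equilibrium weights are 1, $\lambda_1$, $\lambda_2$.

```python
from fractions import Fraction


def loss_accept_all(lam1, lam2):
    """Fraction of attempts lost when no unblocked call is rejected."""
    # Unnormalized equilibrium weights: idle, line 1 on trunk, line 2 on trunk.
    w_idle, w_line1, w_line2 = 1, lam1, lam2
    # An attempt is lost only when the other line already holds the trunk.
    lost = w_line1 * lam2 + w_line2 * lam1
    offered = w_idle * (lam1 + lam2) + lost
    return Fraction(lost) / Fraction(offered)


def loss_reject_line1(lam1, lam2):
    """Fraction of attempts lost when every call of line 1 is refused."""
    # Line 1 never holds the trunk, so only two states carry weight.
    w_idle, w_line2 = 1, lam2
    # All attempts of line 1 are lost; line 2 is never blocked.
    lost = w_idle * lam1 + w_line2 * lam1
    offered = w_idle * (lam1 + lam2) + w_line2 * lam1
    return Fraction(lost) / Fraction(offered)


def rejecting_is_better(lam1, lam2):
    """True when refusing line 1 gives strictly smaller loss."""
    return loss_reject_line1(lam1, lam2) < loss_accept_all(lam1, lam2)
```

For instance, with rates 1 and 3 the two functions give 3/5 and 4/7, so refusing the first line is better there, in agreement with the inequality on $\lambda_2^2$.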

For an example involving the “community of interest”, consider two
disjoint sets of (n + 1) lines communicating over one trunk, with the 
quirk that each set has a distinguished line which only attempts calls to 
the distinguished line in the other set, while the other n lines of one set 
only attempt calls to the n nondistinguished lines of the other set. Let 
c be the call consisting of the two distinguished lines talking to each 
other. If c is always rejected, the probability of blocking is 


\[ \frac{1 + \lambda n^2 + \lambda n^2 (n-1)^2}{1 + n^2 + \lambda n^2 + \lambda n^2 (n-1)^2}, \]


where we have assumed that lines which call each other do so at rate $\lambda$,
and holding-times have unit mean. If c is always accepted when it is not
blocked, then the probability of blocking is

\[ \frac{2\lambda n^2 + \lambda n^2 (n-1)^2}{2\lambda n^2 + 1 + n^2 + \lambda n^2 (n-1)^2}. \]

From these formulas it follows that it is better to reject c entirely if n is
large enough, or if $\lambda$ is large enough, while if $\lambda$ is small enough it is better
always to accept c.
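
The comparison between always rejecting and always accepting c can be explored numerically. The sketch below is an illustration under stated assumptions, not the paper's own computation: the $n^2$ states with one ordinary call are lumped together, every idle pair of lines that call each other attempts at rate $\lambda$, and any blocked or refused attempt counts as lost. The name `loss` and its parameters are hypothetical.

```python
from fractions import Fraction


def loss(n, lam, accept_c):
    """Fraction of attempts lost for the two-set, one-trunk example."""
    lam = Fraction(lam)
    # Unnormalized equilibrium weights of the lumped states.
    w_idle = Fraction(1)
    w_ordinary = n * n * lam                 # one of the n^2 ordinary calls is up
    w_c = lam if accept_c else Fraction(0)   # the distinguished call c is up

    # Attempt rates in each kind of state.
    offered_idle = lam * (1 + n * n)         # c plus n^2 ordinary pairs
    lost_idle = Fraction(0) if accept_c else lam
    # Trunk busy with an ordinary call: c and (n-1)^2 ordinary pairs are blocked.
    lost_ordinary = lam * (1 + (n - 1) ** 2)
    # Trunk busy with c: all n^2 ordinary pairs are blocked.
    lost_c = lam * n * n

    lost = w_idle * lost_idle + w_ordinary * lost_ordinary + w_c * lost_c
    offered = w_idle * offered_idle + w_ordinary * lost_ordinary + w_c * lost_c
    return lost / offered
```

Calling `loss(3, 1, False)` and `loss(3, 1, True)` shows rejection winning at n = 3 and unit calling rate, while a small rate such as 1/100 reverses the comparison.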


IV. STATES, EVENTS, AND ASSIGNMENTS 


The elements of the mathematical model to be used for our study of 
routing separate naturally into combinatorial ones and probabilistic. 
The former arise from the structure of the connecting network and from 
the ways in which calls can be put up in it; the latter represent assump- 
tions about the random traffic the network is to carry. The combinatorial 
and structural aspects are discussed in this section; terminology and 
notation for them are introduced. The probabilistic aspects are con- 
sidered in a later section. 

A connecting network $\nu$ is a quadruple $\nu = (G, I, \Omega, S)$, where $G$ is a
graph depicting network structure, $I$ is the set of nodes of $G$ which are
inlets, $\Omega$ is the set of nodes of $G$ that are outlets, and $S$ is the set of
permitted states. Variables $x, y, z$ at the end of the alphabet denote states,
while $u$ and $v$ (respectively) denote a typical inlet and a typical outlet.
A state $x$ can be thought of as a set of disjoint chains on $G$, each chain
joining $I$ to $\Omega$. Not every such set of chains represents a state: sets with
wastefully circuitous chains may be excluded from $S$. It is possible that
$I = \Omega$, that $I \cap \Omega = \emptyset$ (the null set), or that some intermediate condition
obtains, depending on the “community of interest” aspects of the
network $\nu$.

The set $S$ of states is partially ordered by inclusion $\le$, where $x \le y$
means that state $x$ can be obtained from state $y$ by removing zero or
more calls. If $x$ and $y$ satisfy the same assignment of inlets to outlets,
i.e., are such that all and only those inlets $u \in I$ are connected in $x$
to outlets $v \in \Omega$ which are connected to the same $v$ in $y$ (though possibly
by different routes), then we say that $x$ and $y$ are equivalent, written
$x \sim y$.

The set $S$ of states determines another set $\mathcal{E}$ of events, either hangups
(terminations of calls), successes (successful call attempts), or blocked
or rejected calls (unsuccessful call attempts). The occurrence of an event
in a state may lead to a new state obtained by adding or removing a call
in progress, or it may, if it is a blocked call or one that is rejected, lead
to no change of state. Not every event can occur in every state: naturally,
only those calls can hang up in a state which are in progress in that state,
and only those inlet-outlet pairs can ask for a connection between them
in a state that are idle in that state. The notation $e$ is used for a (general)
event, $h$ for a hangup, and $c$ for an attempted call. If $e$ can occur in $x$ we
write $e \in x$. A call $c \in x$ is blocked in a state $x$ if there is no $y \in S$ which
covers $x$ in the sense of the partial ordering $\le$ and in which $c$ is in progress.
For $h \in x$, $x - h$ is the state obtained from $x$ by performing the
hangup $h$.

We denote by $A_x$ the set of states that are immediately above $x$ in the
partial ordering $\le$, and by $B_x$ the set of those that are immediately
below. Thus,

\[ A_x = \{\text{states accessible from } x \text{ by adding a call}\}, \]
\[ B_x = \{\text{states accessible from } x \text{ by a hangup}\}. \]


For an event $e \in x$, the set $A_{ex}$ is to consist of those states $y \ne x$ to which
the network might pass upon the occurrence of $e$ in $x$. Thus, if $e$ is a
blocked call, $A_{ex} = \emptyset$; also

\[ \bigcup_{h \in x} A_{hx} = B_x, \]
\[ \bigcup_{\substack{c \in x \\ c \text{ not blocked in } x}} A_{cx} = A_x. \]


The number of calls in progress in state $x$ is denoted by $|x|$. The
number of call attempts $c \in x$ which are not blocked in $x$ is denoted by
$s(x)$, for “successes in $x$.” The functions $|\cdot|$ and $s(\cdot)$ defined on $S$ play
important roles in the stochastic process to be used for studying routing.

It can be seen, further, that the set $S$ of states is not merely partially
ordered by $\le$, but also forms a semilattice, or a partially ordered system
with intersections, with $x \cap y$ defined to be the state consisting of those
calls and their respective routes which are common to both $x$ and $y$.
(See G. Birkhoff, p. 18, ex. 1 and footnote 6.)

An assignment is a specification of what inlets should be connected to
what outlets. The set $A$ of assignments can be represented as the set of
all fixed-point-free correspondences from $I$ to $\Omega$. The set $A$ is partially
ordered by inclusion, and there is a natural map $\gamma(\cdot): S \to A$ which
takes each state $x \in S$ into the assignment it realizes; the map $\gamma(\cdot)$ is
a semilattice homomorphism of $S$ into $A$, since

\[ x \ge y \text{ implies } \gamma(x) \ge \gamma(y), \]
\[ \gamma(x \cap y) \le \gamma(x) \cap \gamma(y). \]
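
The order structure just described can be made concrete with sets. The following is a small sketch under assumptions of our own: a state is modeled as a frozenset of (inlet, outlet, route) triples, the partial ordering is set inclusion, the intersection of two states is set intersection, and the map to assignments simply forgets the routes. The names `gamma`, `x`, `y`, and the route labels are illustrative.

```python
def gamma(state):
    """Assignment realized by a state: keep who talks to whom, forget routes."""
    return frozenset((inlet, outlet) for inlet, outlet, _route in state)


# Two equivalent states: the same calls are up, but the call (u1, v1)
# uses a different route in each.
x = frozenset({("u1", "v1", "route_a"), ("u2", "v2", "route_b")})
y = frozenset({("u1", "v1", "route_c"), ("u2", "v2", "route_b")})

# Intersection keeps only calls that agree in route as well as in terminals.
meet = x & y

# Removing a call moves down in the partial ordering.
smaller = frozenset({("u2", "v2", "route_b")})
```

Here `gamma(meet)` is a proper subset of `gamma(x) & gamma(y)`, which illustrates why the second relation is an inequality rather than an equality.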


V. ROUTING MATRICES 


It will be assumed throughout this work that attempted calls to busy 
terminals are rejected, and have no effect on the state of the network; 
similarly, blocked attempts to call an idle terminal are refused, with no 
change of state. Attempts to place a call are completed instantly with 
some choice of route, or are rejected, in accordance with some policy of 
routing. 

Two mathematical descriptions of how routes are assigned to calls 
will be used. The first, the routing matrix, is convenient for writing the 
Kolmogorov equations for the Markov processes representing network 
operation. The second, called a policy, affords a convenient notation for 
the actual determination of optimal routing methods for various net- 
works to be described in detail later. Either description is a rule or 
doctrine for routing. 

A routing matrix $R = (r_{xy})$, $x, y \in S$, has the following properties: for
each $x \in S$, let $\Pi_x$ be the partition of $A_x$ induced by the equivalence
relation $\sim$ of “having the same calls up,” or satisfying the same assignment
of inlets to outlets; then for each $Y \in \Pi_x$, $r_{xy}$ for $y \in Y$ is a possibly
improper probability distribution over $Y$ (that is, it may not sum
to unity over $Y$),

\[ r_{xx} = s(x) - \sum_{y \in A_x} r_{xy}, \]

and $r_{xy} = 0$ in all other cases.

The interpretation of the routing matrix $R$ is to be this: any $Y \in \Pi_x$
represents all the ways in which a particular call $c$ not blocked in $x$
(between an inlet idle in $x$ and an outlet idle in $x$) could be completed
when the network is in state $x$; for $y \in Y$, $r_{xy}$ is the chance that if this
call $c$ is attempted in $x$, it will be completed by being routed through the
network so as to take the system to state $y$. That is, we assume that if
$c$ is attempted in $x$, then with probability

\[ 1 - \sum_{y \in A_{cx}} r_{xy} \tag{1} \]

it is rejected (even though it is not blocked), and with probability $r_{xy}$
it is completed by being assigned the route which would change the
state $x$ to $y$, for $y \in A_{cx}$. The possibly improper distribution of probability
$\{r_{xy}, y \in Y\}$ indicates how the calling rate $\lambda$ due to $c$ is to be
spread over the possible ways of putting up the call $c$, while the improper
part (1) is just the chance that it is rejected outright.

This description of routing matrices is a generalization of that used 
in Refs. 1 and 2 in that it permits, in the nonvanishing of (1), the rejec- 
tion of unblocked calls forbidden in the cited references. 

Thus, a routing matrix $R$ is any function on $S^2$ with $r_{xy} \ge 0$, $r_{xy} = 0$
unless $y \in A_x$ or $y = x$, and such that

\[ r_{xx} = s(x) - \sum_{y \in A_x} r_{xy} \]

and

\[ \sum_{y \in A_{cx}} r_{xy} \le 1, \]

for all $c \in x$ not blocked in $x$. A routing matrix corresponds to a fixed
rule if $r_{xy} = 0$ or 1 for $x \ne y$; otherwise it corresponds to a randomized
rule. The convex set of all possible routing matrices is denoted by $C$.

A policy is a function $\varphi: \mathcal{E} \times S \to S$ such that $c, h \in x$ imply

\[ \varphi(c, x) \in A_{cx} \cup \{x\}, \]
\[ \varphi(h, x) = x - h. \]

It is apparent that a policy is equivalent to a fixed rule; the circumstance
that $\varphi(\cdot, x)$ is defined also for hangups $h$ is useful in the sequel.
Variables $\varphi, \psi$ are used to denote policies.

The routing rules and doctrines that might be considered here are of 
course more numerous by far than those we have introduced above. 
In particular, time-dependent rules and history-dependent rules are 
natural generalizations. However, since we will be considering only time- 
invariant traffic and ergodic Markov processes as representations of 
operating networks, such generalizations add little of significance. 

An important point, however, is that the routing methods here considered
are based on a complete knowledge of the state of the system,
i.e., we postulate that we are in the case of “perfect information.” This
postulate is grossly unrealistic for present day electromechanical tele- 
phone systems; for an electronic system with a very large and very 
cheap memory, it becomes realistic: the state of the network can ac- 
tually be stored and the routing rule in use represented by a giant trans- 
lator. Such a procedure overcomes the obvious impracticality of deter- 
mining the state by examination of the actual network, and is actually 
used in the Bell System’s No. 1 ESS (Electronic Switching System).

The routing matrices R used in Refs. 1 and 2 had the property that 
if a call is not blocked in a state, then it is completed in some way; only 
blocked attempts or attempts to busy terminals are rejected. Thus none 
of these rules for routing resembles the methods that are at present 
likely to be used in practice. However, since C contains rules that reject 
certain calls in certain states, even though these calls are not blocked, it 
turns out that a large class of routing rules which do mirror what might 
happen in practice is included in C. 

Some of the simplest routing rules are not based on any knowledge 
about the current state of the network. Given a call c that has been 
attempted, they provide a list of routes to be tried in order; the first 
route found available is used for the call. The list may include all possi- 
ble routes for c, or only some of them. It is easy to construct a routing 
matrix to represent such a rule. Let $r_1, r_2, \ldots, r_n$ be the routes to be
tried for a call $c$. For each state $x$ in which $c$ can occur, let $r_{xy} = 1$ if
use of the first $r_i$ that is available in $x$ takes the system from $x$ to $y$,
and let $r_{xy} = 0$ for all other $y \in A_{cx}$. If no route for $c$ that is available
in $x$ is among $r_1, \ldots, r_n$, then $c$ is rejected in $x$ even though it may not
be blocked, simply because the “sieve” for finding routes is too coarse.
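
The ordered-list rule can be sketched on a toy network. Everything named here is hypothetical and chosen only for illustration: call `c1` may use route `A` (link 1) or route `B` (link 2), call `c2` has the single route `C` (link 1), a state is a frozenset of (call, route) pairs, and `first_fit` returns the next state, or the unchanged state when the call is refused. It is assumed that the attempted call is not already in progress.

```python
# Hypothetical toy network: each route is the set of links it occupies.
ROUTES = {
    "c1": {"A": frozenset({1}), "B": frozenset({2})},
    "c2": {"C": frozenset({1})},
}


def busy_links(state):
    """Links occupied by the calls in progress in a state."""
    links = set()
    for call, route in state:
        links |= ROUTES[call][route]
    return links


def is_blocked(state, call):
    """A call is blocked when every one of its routes hits a busy link."""
    busy = busy_links(state)
    return all(ROUTES[call][route] & busy for route in ROUTES[call])


def first_fit(state, call, route_list):
    """Try the listed routes in order; use the first one that is free."""
    busy = busy_links(state)
    for route in route_list:
        if not (ROUTES[call][route] & busy):
            return state | {(call, route)}
    # No listed route is free: the call is refused, blocked or not.
    return state
```

With `c2` in progress, the short list `["A"]` refuses `c1` although route `B` is free, which is the coarse-sieve effect described above; the list `["A", "B"]` completes the call on `B`.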

It was assumed in the previous paragraph that no information about 
the state was used. If it is known, e.g., in which element $A$ of a partition
$\Pi$ of $S$ the state currently is, a similar rule can be represented by a
class of lists (of routes to be tried in order), one for each $A \in \Pi$. The
same kind of construction then yields the appropriate $R$. Here the $A$
such that $x_t \in A$ is acting as the “information state.”

Thus, many R from C which reject certain calls in certain states de- 
scribe a rule which closely resembles what is done in practice, e.g., in 
the translator of the Bell System No. 4A crossbar switching system. 


VI. PROBABILISTIC ASSUMPTIONS AND STOCHASTIC PROCESSES 


A Markov stochastic process $x_t$ taking values on $S$ is used as a mathematical
description of an operating connecting network subject to random
traffic. It is assumed that this operation is in accordance with one of the
routing matrices $R$ of Section V. The rest of the process $x_t$ is based on
two simple probabilistic assumptions:


(i) Holding-times of calls are mutually independent variates, each
with the negative exponential distribution of unit mean.

(ii) If $u$ is an inlet idle in state $x$, and $v \ne u$ is any outlet, there is a
(conditional) probability

\[ \lambda h + o(h), \qquad \lambda > 0, \]

that $u$ attempt a call to $v$ in $(t, t+h)$ if $x_t = x$, as $h \to 0$.


The choice of unit mean for the holding-times merely means that the 
mean holding-time is being used as the unit of time, so that only the 
traffic parameter $\lambda$ needs to be specified.

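It is convenient to collect these assumptions and the chosen routing
matrix $R$ into one transition rate matrix $Q = (q_{xy})$ characteristic of $x_t$;
this matrix is given by

\[ q_{xy} = \begin{cases} 1 & \text{if } y \in B_x \\ \lambda r_{xy} & \text{if } y \in A_x \\ -|x| - \lambda[s(x) - r_{xx}] & \text{if } y = x \\ 0 & \text{otherwise.} \end{cases} \tag{2} \]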


In terms of the transition rate matrix $Q$, it is possible to define an ergodic
stationary Markov stochastic process $\{x_t, t \text{ real}\}$ taking values on $S$.
The matrix $P(t)$ of transition probabilities

\[ p_{xy}(t) = \Pr\{x_t = y \mid x_0 = x\} \]

satisfies the equations of Kolmogorov

\[ \frac{d}{dt} P(t) = Q P(t) = P(t) Q, \qquad P(0) = I, \]

and is given formally by the formula

\[ P(t) = \exp tQ. \]


Since the zero state (the state with no calls in progress) is accessible 
from any state in a finite number of steps with positive probability, the 
process has only one ergodic class, and there exists a unique nonnegative
row-vector

\[ p = \{p_x, x \in S\} \]

such that as $t \to \infty$

\[ P(t) \to \begin{pmatrix} p \\ \vdots \\ p \end{pmatrix}, \]

and $p$ satisfies the “statistical equilibrium” or stationarity condition
$pQ = 0$, which can be written out in full in the simple form

\[ [\,|x| + \lambda s(x) - \lambda r_{xx}\,]\, p_x = \sum_{y \in A_x} p_y + \lambda \sum_{y \in B_x} p_y r_{yx}, \qquad x \in S. \]
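
The passage from a routing matrix to the transition rate matrix, and the equilibrium condition, can be illustrated on the smallest nontrivial case. This is a sketch under assumptions of our own: two inlet-outlet pairs share one trunk, so the states are `"0"` (idle), `"a"`, and `"b"`; the tables `ABOVE`, `BELOW`, `SIZE`, and `SUCC` hold $A_x$, $B_x$, $|x|$, and $s(x)$ for this toy network, and the routing matrix is passed as a dictionary of its off-diagonal entries.

```python
from fractions import Fraction

STATES = ["0", "a", "b"]
ABOVE = {"0": ["a", "b"], "a": [], "b": []}   # A_x
BELOW = {"0": [], "a": ["0"], "b": ["0"]}     # B_x
SIZE = {"0": 0, "a": 1, "b": 1}               # |x|
SUCC = {"0": 2, "a": 0, "b": 0}               # s(x)


def rate_matrix(r, lam):
    """Transition rates built from off-diagonal routing entries r[(x, y)]."""
    q = {}
    for x in STATES:
        # Diagonal of R: expected number of unblocked calls that are refused.
        r_xx = SUCC[x] - sum(r.get((x, y), 0) for y in ABOVE[x])
        for y in STATES:
            if y in BELOW[x]:
                q[x, y] = Fraction(1)                 # hangup at unit rate
            elif y in ABOVE[x]:
                q[x, y] = lam * r.get((x, y), 0)      # accepted new call
            elif y == x:
                q[x, y] = -SIZE[x] - lam * (SUCC[x] - r_xx)
            else:
                q[x, y] = Fraction(0)
    return q


lam = Fraction(1, 2)
accept_all = {("0", "a"): 1, ("0", "b"): 1}
q = rate_matrix(accept_all, lam)

# Candidate equilibrium vector, proportional to (1, lam, lam).
weights = {"0": Fraction(1), "a": lam, "b": lam}
total = sum(weights.values())
p = {x: w / total for x, w in weights.items()}
```

Each row of `q` sums to zero, and the vector `p` satisfies the stationarity condition column by column, as the checks accompanying this sketch confirm.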


It is possible that a confusion arises in the mind of the reader as to 
whether we are talking about central office connecting networks or large 
trunk networks such as the toll system. For in telephone traffic theory 
these two areas of application are often described by different models: 
a “finite-source” model like the present one, in which the conditions of
the inlets and outlets form a significant part of the state of the system,
is commonly used for the former; an “infinite-source” model, with groups
of customers’ lines reduced to Poisson sources of traffic, is frequently
used for the latter. The reason for this difference is that it has simply 
turned out to be sufficient, in the toll case, to restrict attention to the 
trunking network as the object of principal interest, and to use the sim- 
pler Poisson description of sources. 

In principle, of course, the model to be used here serves to describe 
either area listed above, although in the toll case it naturally demands 
use of a very large number of states. Thus, in the sequel we make no at- 
tempt to distinguish the toll case from the central office case. This view- 
point is justified by the fact that the results to be obtained are robust 
under passage from finite- to infinite-source models, or they can be re- 
formulated and reproved in the infinite-source context. 


VII. FORMULATION OF THE ROUTING PROBLEM 


The most common figure of merit used by telephone traffic engineers 
for evaluating connecting networks is the probability of blocking, the 
fraction of call attempts that are blocked. It is natural, therefore, to use 
this quantity as the objective function in our optimization problem of 
routing. It has been shown (Ref. 2) for the process $x_t$ to be studied here that if
no unblocked call is rejected the probability of blocking (in the mnemonic
form Pr{bl}) is given in terms of the stationary state probability vector
$p$ by the formula

\[ \Pr\{bl\} = \frac{p'\beta}{p'\alpha}, \]

where
$\beta_x$ = number of idle inlet-outlet pairs that are blocked in state $x$,
$\alpha_x$ = number of idle inlet-outlet pairs in state $x$.

By the same methods it follows that for a process $x_t$ defined in terms
of an $R \in C$ the fraction of attempted calls which are not completed
(are “lost”), be it because they were blocked or simply rejected, is
given by

\[ \frac{p'(\beta + r)}{p'\alpha}, \]

where $r = \{r_{xx}, x \in S\}$ is the diagonal of the routing matrix $R$.
We can now replace the informal problem of minimizing, by suitable 
routing, the fraction of call attempts that are lost by a precise problem 
of mathematical programming, as follows: Choose R € C so as to achieve 


? 


/ 
min Pp (8 an r) 
p'a 
subject to p'Q = 0, p'1 = 1, and p = 0. (The ‘1’ in ‘p'I’ is the vector 
with all components 1.) Of the constraints, the first is the equilibrium 
condition on p, the second states that the components of p sum to one, 


and the third says that p is nonnegative. It is understood, of course, 
that Q is to be related to R by (2) or, what is the same, by 


where $H = (h_{xy})$ is the "hangup matrix" such that $h_{xy} = 1$ or $0$
according as $y \in B_x$ or not.

Several authors have formulated routing problems for communica- 
tions systems. Many of these problems have dealt with systems of the 
store-and-forward type, in which information is alternately stored at 
and transmitted from a node in the network without setting up a ‘“‘con- 
tinuous path” from source to destination. Such formulations are inap- 
plicable to telephone systems. A possible exception, though, is that’ 
of R. Kalaba and M. Juncosa which, for a given amount of traffic be- 
tween each specified source and destination, and a given network having 
capacity constraints, attempts to find continuous routes that are best 
in the sense of maximizing the delivered traffic by solving a linear pro- 
gramming problem. 

In its possible application to telephony, this model envisions a given 
traffic pattern (i.e., a description of who wants to talk to whom) to be 
satisfied at a particular moment, and tries to find a way of routing as 
much of this traffic as possible through the network. In our terminology, 
a traffic pattern is an assignment $a(\cdot)$, and satisfying it means finding
an $x \in S$ such that $\gamma(x) = a$. The amount of traffic carried is simply
the number $|x|$ of calls in progress. Of course, it is not always possible
to satisfy an assignment. Thus, Kalaba's and Juncosa's formulation
translates into our setup as follows: Given an assignment $a(\cdot)$, either
find $x \in S$ with $\gamma(x) = a$, or else, if $a(\cdot)$ is unrealizable, find $x \in S$
which realizes as much of $a(\cdot)$ as possible, i.e., such that $\gamma(x) \le a$ and
$|x|$ is a maximum. This can be rephrased as follows: If $a(\cdot)$ is given,
form the cone

$$K = K(a) = \{a' : a' \le a\},$$

and within $\gamma^{-1}(K)$ pick a state $x$ that is maximal in that $|x| \ge |y|$
for each $y \in \gamma^{-1}(K)$.

It is to be emphasized that this problem is markedly different from 
our form of the routing problem. The former is purely combinatorial in 
character. There is no parameter such as the traffic $\lambda$ per inlet-outlet
pair, so the problem involves no probability, and can have nothing to 
do with the “grade of service” as customarily employed by telephone 
engineers. Furthermore, the whole formulation overlooks the fact that 
in present systems call completions must be made without disturbing 
calls already in progress. 


VIII. PRINCIPLES OF ROUTING 


It is important to distinguish methods of routing from principles of 
routing. A method of routing is a specific way of accepting or rejecting 
attempted calls and choosing routes in a particular system, e.g., that 
implicit in the translator of the Bell System No. 4A crossbar switching 
system. A principle of routing is a kind of general prescription of what 
constitutes* “good” or “optimal” routing; it is the backbone of many 
routing methods that might be based on it. 

A principle of routing is particularly useful if it has two properties: 


(i) It is relatively simple and intuitive to state.
(ii) There is a substantial class of systems for which it describes
the (or part of the) optimal routing method.


In our mathematical setting a method of routing corresponds roughly
to a rule $R \in C$. We shall see that the "best" rule $R \in C$ can be
obtained by solving a linear programming problem. Now if it should
happen that for an interesting class of networks the solutions of these
linear programs had some common characteristic, some combinatorial
property of the sets of states of the networks that served as an alternate
description of the linear program solution, then this characteristic or
property could be abstracted into a genuine principle of routing.

* Or, more usually, of what someone's intuition tells him constitutes.

Alternatively, one could formulate as conjectures some intuitive prin-
ciples of routing, and then try to determine for what classes of networks
(if any!) these principles did, in fact, describe the optimum routing
methods. This second approach will be followed in the present work; the
rest of this section is devoted to a discussion of some a priori reasonable
candidates for "good" routing rules. All of these candidates are expres-
sions of one and the same idea, namely, that one routing rule is better
than another if it avoids more "bad" states, where a "bad" state $x$ is
one for which $\beta_x$ is high. This idea is not just an attractive first approxi-
mation to "good" or even optimal routing; it leads at once to conjectures
for which our results later in the paper provide strong support in precise
ways.

In spite of the lack of general theoretical knowledge about routing, 
traffic engineers have developed various conjectures and intuitive ideas 
about what might constitute "good" methods for choosing routes. These
conjectures are a natural starting place for any rigorous approach to 
routing, because the formulation of precise theoretical models in which 
routing can be studied at once raises the question, ‘‘Which of these 
methods, conjectured to be good, can be proved to be optimal in some 
theoretical model?” Since many of these methods are relatively simple 
to describe, and hence to mechanize, established answers to this question 
would have immediate practical applications. Some of these conjectures 
will now be discussed. 

It is apparent that in a telephone system, putting up a new call can 
only increase the number of idle pairs that are already blocked. Another 
way of saying this is that in giving service, i.e., in realizing an attempted
call in a connecting network, one is possibly denying service to certain 
inlets and outlets presently idle, who might attempt a call in the very 
immediate future. This observation has given rise to a number of routing 
rules (for systems with blocked attempts refused) of great intuitive 
appeal, which can be described collectively by the admonition: To de- 
crease (minimize?) the probability of blocking, put in new calls in such 
a way as to minimize the additional congestion resulting from the new 
calls. 

It is illuminating to discuss particular forms of this advice. One form 
is this: Route new calls through the most heavily loaded part of the net- 
work that will accept them. Another is: Put in a given new call so as to 
minimize the chance that the next attempt to place a call be blocked.
Or: Avoid blocking states, that is, prefer states in which fewer idle
pairs are blocked.

For all the intuitive appeal possessed by these rules, rather little is 
known about them. Nevertheless, they provide conjectures that will be 
examined in the precise setting of our theoretical model to yield, we hope, 
the beginnings of a mathematical theory of optimal routing. Let us see 
what these rules enjoin in terms of our model. If we put up a call $c$ so as
to take the system to a state $y$, the chance that the next event is a blocked
call attempt is

$$\frac{\lambda \beta_y}{|y| + \lambda \alpha_y}.$$

Suppose that we just left state $x$, so that $y \in A_{cx}$. This probability will
be smallest if $y$ was chosen according to the "maximum $s(\cdot)$" policy,
that is,

$$s(y) = \max_{z \in A_{cx}} s(z),$$

i.e., if we prefer states in which fewer idle pairs are blocked. Thus, in our
model the second two forms of the above advice coincide.
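
As a small numerical illustration of the formula above, the following Python sketch evaluates the chance that the next event is a blocked attempt and applies the "maximum s(·)" choice. The candidate states and their counts are hypothetical, and the sketch assumes that s counts the idle unblocked pairs (s = alpha - beta), with hangups at unit rate per call in progress and attempts at rate lambda per idle pair.

```python
from fractions import Fraction


def next_event_blocked(lam, calls, idle_pairs, blocked_pairs):
    """Chance that the next event in a state is a blocked call attempt.

    Hangups occur at total rate `calls`; attempts occur at total rate
    lam * idle_pairs, of which lam * blocked_pairs are blocked.
    """
    lam = Fraction(lam)
    return lam * blocked_pairs / (calls + lam * idle_pairs)


def max_s_choice(candidates):
    """Pick the candidate with the most idle unblocked pairs (assumed s = alpha - beta)."""
    return max(candidates, key=lambda st: st["alpha"] - st["beta"])


# Hypothetical states reachable by completing one call: same |y| and alpha,
# different numbers of blocked idle pairs.
candidates = [
    {"name": "y1", "calls": 2, "alpha": 4, "beta": 2},
    {"name": "y2", "calls": 2, "alpha": 4, "beta": 0},
]
lam = Fraction(1, 2)

best = max_s_choice(candidates)
probs = {
    c["name"]: next_event_blocked(lam, c["calls"], c["alpha"], c["beta"])
    for c in candidates
}
print(best["name"], probs)
```

When all candidates have the same number of calls and of idle pairs, as here, the state with the largest s is also the one with the smallest chance of an immediately following blocked attempt.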

Another conjecture arises out of consideration of gradings in which 
calls overflowing certain primary routes are pooled and offered to over- 
flow circuits. Here a natural expectation is that one should always ‘“‘fill 
the holes in the multiple,’ meaning by this that a primary route should 
be used whenever possible, so that the overflow is left available to as 
many lines as possible. It will be shown for certain examples that if calls 
are accepted unless they are blocked, then this rule both describes the 
optimum routing choices, and is equivalent to the "maximum $s(\cdot)$"
policy of the previous paragraph.


IX. SUMMARY AND DISCUSSION 


In Sections I to VII the problem of routing calls in a telephone net- 
work has been formulated as a mathematical one within Erlang’s basic 
traffic theory. Some routing rules which are intuitively reasonable can- 
didates for “good’”’ or even optimal routing were described in Section 
VIII. 

Since the expansion of $\{p_x, x \in S\}$ such that $p'Q = 0$, $p > 0$, is
known, it is natural to start in Section X with a consideration of
Pr{bl} for low traffic: $\lambda \to 0$. We have

$$p_x = p_0 \frac{\lambda^{|x|}}{|x|!}\, r_x + o(\lambda^{|x|}), \qquad \lambda \to 0,$$

where $r_x$ is the number of strictly ascending (in $S$) paths from 0 to $x$
which are permitted by $R$. If $x$ is a blocking state it contributes a term

$$\frac{p_x \beta_x}{p'\alpha} = \frac{p_0\, \lambda^{|x|}\, r_x \beta_x}{|x|!\; p'\alpha} + o(\lambda^{|x|}), \qquad \lambda \to 0,$$

to Pr{bl} if no calls are rejected. It follows that for sufficiently low traffic
the policy that minimizes $r_x$ is optimal within the policies that reject no
calls. In a similar way, it can be shown that always refusing a call $c$
cannot be optimal for $\lambda$ sufficiently small, and that there is never any
point in rejecting a call attempt in a state $x$ with

$$|x| < \min\{|y| : y \in S,\ \beta_y > 0\},$$

for $\lambda$ small enough.

The nonlinear problem of choosing $R$ to minimize Pr{bl} is reduced to
a linear programming problem in Section XI. This reduction substan- 
tially facilitates obtaining numerical results, examples of which appear 
later in this summary. 

In an effort to identify optimal routing policies, attention now (Sec- 
tion XII) shifts away from the formal linear programming approach to 
the underlying Markov process. It is shown that minimizing Pr{bl}, and 
maximizing the fraction of events which are successful call attempts, 
are equivalent; this fact leads to a direct dynamic programming
approach, in which

$$\min_{R \in C} \Pr\{bl\}$$

and

$$\lim_{n \to \infty} n^{-1} \max E\{\text{number of successful call attempts in } n \text{ events}\}$$

(with the maximum in the second expression over all possible policies
for $n$ events) are both achieved by essentially the same stationary policies.
The word ‘essentially’ hides the inherent nonuniqueness of optimal 
policies due to symmetries in the network and to the possible presence 
of transient states. 

In Section XIII it is shown, following C. Derman, that minimum 
blocking is achieved by a fixed rule. 

The mathematical programming problems arising in this new approach 
are again of the linear programming type, and are similar to those arising 
in Section XI. Our principal interest, however, does not remain with 
calculating numerical solutions, but shifts abruptly to the relationships 
of these solutions to the combinatorial structure of the network. Thus,
the second half of this paper consists less of suitable programming
problems than of intuition and combinatorics applied to exhibit (in part or
in toto) the solutions of these problems and their dependence on and origin
in network structure.

The attempt to discover and characterize optimal policies in a whole- 
sale way by appeal to network combinatorics (rather than piecemeal by 
numerical calculation) begins in Sections XIV and XV with considera- 
tion of some simple examples; these lead to the introduction of some 
‘“‘monotone”’ properties (of connecting networks) which impose the con- 
dition that (roughly) the relative merit (as far as blocking is concerned) 
of states is consistent or continuous, i.e., that if a state x is “better” 
than another y, then the neighbors of x are in the same sense ‘‘better”’ 
than the corresponding neighbors of y. 

Consideration of these properties is justified by the facts that (i)
they appear in the examples, and (ii) they yield a series of closely knit
results (Theorems 7-15) that go far to bear out the heuristic guesses in 
Section VIII about the nature of good routing. In particular, in a net- 
work with one of the monotone properties, a policy which rejects no 
unblocked calls and minimizes the number of additional calls that are 
blocked by completing an attempted call differs from an optimal policy 
only in that the latter may reject some calls. In other words, the "max
$s(\cdot)$" policy is optimal to within rejection of calls.

Each monotone property gives rise to a corresponding isotony theorem 
which gives a numerical expression to the relative merits of routes for 
calls that are implicit in the purely combinatorial monotone property. 
The relevance of these isotony theorems to optimal routing is explained 
heuristically in Section XVI. The theory culminates, in Section XVIII, 
with two optimal routing theorems based on the monotone properties. 
When one of these properties obtains, these results completely answer 
the question: Which route should be used for an accepted call when there 
is a choice of routes? Determining the extent to which these combina- 
torial properties occur in networks of interest appears to be the next 
major problem in any continuation of the present study. 

It is to be stressed that the monotone properties we introduce serve 
only to identify the route that a call should take if it is to be accepted;
they do not in any way help to decide which calls should be accepted. 
Except for the low-traffic results of Section X, and the (obvious and 
easily proved) fact that in a nonblocking network no call should be re- 
jected, the problem of acceptance or rejection of calls remains an enigma. 
Some light on it is shed by the numerical results that immediately follow 
this summary.

The paper concludes in Appendix A with the remark that if the
performance index is modified so as to put greater emphasis on "early
blocked attempts”, i.e., ones occurring soon after the system is started, 
then no calls should be rejected. The result is proved in detail for this 
index: the expected number of events until the first blocked attempt. 
Such a criterion corresponds to trying to avoid the undesirable event, the 
blocked call, as long as possible. 

We turn now to numerical results obtained by solving the linear pro- 
gramming formulation of Section XI for two simple networks. The first 
is the three-stage Clos network with 2 × 2 switches depicted in Figs. 1
and 2, and already considered as an illustration of routing in Refs. 1 
and 2. The second is a 6-line to 4-trunk concentrator in which each line 
has access to 2 trunks; it is shown in Figs. 3 and 4. In this second case, 
the probabilistic model was modified to make $\lambda > 0$ the calling-rate per
idle line, rather than that per idle inlet-outlet pair. 

In each example, both the minimal probability of blocking, and the 
probability of blocking under random routing, were calculated for several
values of $\lambda$ by use of the LP90 program. To be more precise, two linear
programming problems were solved for each example; the first deter- 
mined the optimal policy, the second determined the optimal policy 
among those policies which assigned random routes to accepted calls. 

Several important qualitative features of the optimal routing policy 
were the same in both examples and are described together in the follow- 
ing list: 


(i) The optimal policy rejected no calls.
(ii) The routes assigned by the optimal policy coincided with those
that keep $s(\cdot)$ as large as possible.
(iii) The optimal policy was the same for all values of the traffic
parameter $\lambda$ examined.
(iv) The improvement over random routing brought about by optimal
routing decreases as the traffic $\lambda$ increases.


Fig. 1. 3-stage Clos network with 2 × 2 switches.
Fig. 2. States of 3-stage Clos network of Fig. 1.


Under the constraint that accepted calls be routed at random the op- 
timal policy was again to accept all unblocked attempted calls. 

Results for the Clos network are given in Fig. 5 and Table I. It is
apparent that for low $\lambda$ optimum routing gives a loss that is easily an order
of magnitude less than that due to random routing. At high values of
$\lambda$ the difference all but disappears. This behavior is explained in part by
the fact that there is no blocking in the "upper" states of Fig. 2; when
$\lambda$ is very large the system spends all its time in these states; when $\lambda$ is
low, however, the occasion for a choice between states 2 and 4 often
arises and a correct choice makes a significant difference. (At very low
values of $\lambda$ the difference will again decrease because only state 1 will
ever be visited with any frequency.)
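
The size of the improvement can be read off directly from the tabulated losses. The short Python sketch below takes a few (lambda, optimal, random) rows copied from Table I and forms the ratio of random-routing loss to optimal-routing loss; the variable names are illustrative only.

```python
# A few rows copied from Table I: (lambda, Pr{bl} optimal, Pr{bl} random).
table_i = [
    (0.01, 0.00000181, 0.00018319),
    (0.1, 0.00087324, 0.00960844),
    (1.0, 0.03146853, 0.06360424),
    (10.0, 0.03115608, 0.03463135),
    (100.0, 0.00475109, 0.00480733),
]

# Ratio of the loss under random routing to the loss under optimal routing.
ratios = [(lam, rnd / opt) for lam, opt, rnd in table_i]

for lam, ratio in ratios:
    print(f"lambda = {lam:>6}: random / optimal = {ratio:.3f}")
```

For these rows the ratio falls steadily as the traffic grows, from two orders of magnitude at the lowest tabulated traffic to a difference of about one percent at the highest.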

Results for the concentrator are shown in Fig. 6 and Table II. They
include a numerical comparison with hand-calculated loss figures from
unpublished work of S. P. Lloyd dated circa 1953. At that time Lloyd
studied this particular concentrator model, correctly guessed the optimal
policy, proved its optimality for low $\lambda$, and calculated the loss for some
values of $\lambda$. This example exhibits the behavior, conjectured in Ref. 2,
p. 275, that a good (here, optimal) policy make certain "bad" states
transient states. The state numbered 9 is such a transient state under the
optimal policy found numerically by the linear programming method.

The present study of routing in telephone networks has suggested a 
number of conclusions and conjectures: 


(i) The problem of optimal routing of calls in telephone connecting
networks (with full information) can be formulated and solved
with Erlang's classical theory of traffic. In this endeavor, the
contrasting techniques of machine calculation and combinatorial
analysis can be employed either as alternative methods or as
complementary approaches.

(ii) The problem separates into two parts, that of deciding which
calls to accept, and that of choosing routes for accepted calls.
Analytically, the first part appears to be much harder than the
second, which frequently has a simple intuitive solution closely
related to the structure of the network.

(iii) Posed within Erlang's theory, the routing problem can be reduced
to a (usually very large) linear programming problem and
attacked numerically, or studied in terms of Markov decision
processes and dynamic programming.

(iv) In an apparently wide class of connecting networks, certain
natural monotone properties and some isotonies based on them
are the key to choosing optimal routes for accepted calls. The
resulting optimal policies are remarkably easy to describe and to
instrument; they agree fully with some of the conjectures
developed over years of practical experience in telephony; they are
even robust under changes of performance index. Naturally,
each example studied here involves a very small network.
Nevertheless, the fact that the monotone properties turned up in each
of a substantial number of small networks of diverse structure
suggests that they are also present in larger ones. Whether this is
so is a topic for future research. In any case, the examples we
offer indicate that the theory of routing here developed applies
equally well to central office networks and to various gradings
and concentrators.

Fig. 3. 6-to-4, 2 access concentrator (lines and trunks).
Fig. 4. States of 6-to-4, 2 access concentrator.

(v) In the interesting area of low traffic, optimal routing can be as 
much as an order of magnitude better than random routing; 
with high traffic the advantage decreases rapidly. In all the
examples studied, the optimal routing policy was independent of
the traffic $\lambda$; this suggests that in most cases the optimal policy
is basically a combinatorial feature of the network alone, and is 
probably optimal in many probabilistic models of network opera- 
tion. 

(vi) There are situations in which attempted calls should be rejected
even though they are not blocked. Simple examples of this
phenomenon all seem to be rather unnatural; but J. H. Weber
has discovered it numerically in trunking networks, and has
suggested* that it is associated with unequal lengths of paths for
calls. The examples we studied numerically in the present work
did not show it; but they had the property that all paths for calls
were of the same length. We conjecture that there is a large class
of "regular, well-behaved, normal, etc." networks in which no
optimal policy rejects an unblocked call, and that in general
occasions on which such calls should be rejected are rare. Even if
they occur in practical central office networks, these occasions
probably should be taken seriously (by a company committed to
giving service) only if they are demonstrably associated with
large amounts of congestion or a near-breakdown in operation.
Hence, finding optimal policies to within rejection of calls may be
considered a "practical" solution of the routing problem originally
posed.


Fig. 5. Pr{bl} for Clos 3-stage network with 2 × 2 switches.


TABLE I. PROBABILITY OF BLOCKING FOR CLOS 3-STAGE
NETWORK FOR OPTIMAL AND RANDOM ROUTING

                      Pr{bl}
  λ          Optimal        Random
  0.01       0.00000181     0.00018319
  0.05       0.00015926     0.00334468
  0.1        0.00087324     0.00960844
  0.2        0.00376107     0.02259477
  0.5        0.01593861     0.04807122
  1.0        0.03146853     0.06360424
  2.0        0.04381783     0.06670098
  3.0        0.04584041     0.06206897
  5.0        0.04233249     0.05152606
  10.0       0.03115608     0.03463135
  30.0       0.01405820     0.01459520
  50.0       0.00901346     0.00922144
  100.0      0.00475109     0.00480733


Fig. 6. Pr{bl} for 6-to-4, 2 access concentrator for random and optimal routing.


TABLE II. PROBABILITY OF BLOCKING FOR 6-TO-4, 2 ACCESS
CONCENTRATOR FOR OPTIMAL AND RANDOM ROUTING

             Optimal          Optimal
  λ          (S. P. Lloyd)    (Author)       Random
  0.1        0.0049           0.00536231     0.00864729
  0.2        -                0.02093718     0.02972292
  0.4        0.0716           0.07170622     0.08856109
  0.7        0.1628           -              -
  1.0        0.2478           0.23154056     0.24943320
  2.0        0.4498           0.44971622     0.46141067


X. SOME COMPARISON THEOREMS FOR LOW TRAFFIC 


There are two ways in which a theoretical analysis can substantially 
further progress in the problem of routing: (i) by means of local
comparison theorems that establish that one method of routing is better than
another, and (ii) by means of global optimality theorems that exhibit (in
part or overall) one or more optimal policies which actually achieve the
best possible value of the performance index in use. In this section, we
prove some comparison theorems which are valid asymptotically as the
traffic parameter approaches zero. At first, we restrict the analysis of the
present section to the case in which no unblocked call is rejected if it is
attempted, so that we avoid the difficult question of deciding whether 
an attempted call that is not blocked should be completed or rejected. 

For a first glimmer of insight, we shall examine the formula

$$\Pr\{bl\} = \frac{p'\beta}{p'\alpha}, \qquad Q = Q(R), \quad R \text{ a fixed rule},$$

valid when no unblocked call is rejected, in the very common situation
in which there is an integer greater than zero, $n$ say, such that there is
no blocking in states with fewer than $n$ calls in progress, and there are
states with $n$ calls in progress in which some calls are blocked. In this
case it is known that

$$p'\beta = p_0 \frac{\lambda^n}{n!} \sum_{|x|=n} r_x \beta_x + o(\lambda^n),$$

$$p'\alpha = p_0 \sum_{j=0}^{n} \frac{\lambda^j}{j!}\, \alpha_j \sum_{|x|=j} r_x + o(\lambda^n), \tag{3}$$

as $\lambda \to 0$, where

$$r_x = \text{number of paths on } S \text{ ascending from 0 to } x \text{ and permitted by } R$$
$$\phantom{r_x} = (R^{|x|})_{0x} = \text{the } 0,x \text{ entry of the } |x|\text{-th power of } R, \tag{4}$$

and

$\alpha_j$ = number of idle inlet-outlet pairs in a state having $j$ calls in
progress.


(We recall that for the important cases of one- or two-sided networks
$\alpha_x = \alpha_{|x|} = \alpha_j$ for all $x$ with $|x| = j$.) It follows from (3) that for
small $\lambda$ the leading term is critical: the blocking will depend principally
on how easy it is to reach a blocking state from the zero state, with this
"ease" measured by the number

$$\sum_{|x|=n} r_x \beta_x = (R^n \beta)_0$$

= the number of ways in which a blocked call can arise without
having any hangups, starting at zero.
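
The path counts in (4) and the leading coefficient can be computed by walking the state diagram level by level. The following Python sketch does this for a hypothetical five-state toy diagram; the state names, the two rules R1 and R2, and the blocked-pair counts are illustrative assumptions, and only the ascending (off-diagonal) part of a fixed rule is represented.

```python
def path_counts(R, zero, depth):
    """Count ascending paths from `zero` permitted by a fixed rule R.

    R[x][y] = 1 when R permits the upward transition x -> y. After k steps
    the frontier holds row `zero` of the k-th power of R.
    """
    counts = {zero: 1}
    frontier = {zero: 1}
    for _ in range(depth):
        nxt = {}
        for x, ways in frontier.items():
            for y, r_xy in R.get(x, {}).items():
                nxt[y] = nxt.get(y, 0) + ways * r_xy
        counts.update(nxt)
        frontier = nxt
    return counts


def leading_term(R, beta, zero="0", n=2):
    """Sum of r_x * beta_x over the states with n calls in progress."""
    r = path_counts(R, zero, n)
    return sum(r.get(x, 0) * b for x, b in beta.items())


# Hypothetical toy diagram: g and h have two calls up; only h blocks an idle pair.
beta = {"g": 0, "h": 1}
R1 = {"0": {"a": 1, "b": 1}, "a": {"g": 1}, "b": {"g": 1}}  # always routes to g
R2 = {"0": {"a": 1, "b": 1}, "a": {"h": 1}, "b": {"g": 1}}  # routes to h from a

print(leading_term(R1, beta), leading_term(R2, beta))
```

In this toy case the rule that never enters the blocking state has leading coefficient zero, so its blocking at low traffic is of smaller order than that of the other rule.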


If the matrix $R$ is not fixed, but allows some random choices of route,
then this quantity can still be viewed as the "expected number of ways
in which a blocked call can arise without having any hangups, starting
at zero." It is apparent that this number is given by $f_0$, where the
numbers $\{f_x, |x| \le n\}$ are defined by the nonlinear recurrence

$$f_x = \begin{cases} \beta_x, & |x| = n, \\[1ex] \displaystyle\sum_{\substack{c \text{ idle in } x \\ c \text{ not blocked in } x}} \min_{y \in A_{cx}} f_y, & |x| < n. \end{cases}$$

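Indeed we have the result:


Lemma 1:

$$\sum_{|x|=n} r_x \beta_x \ge f_0 \qquad \text{for } R \in C.$$


Proof: Let $R$ be given and let

$$d_x = \begin{cases} \beta_x, & |x| = n, \\[1ex] \displaystyle\sum_{y \in A_x} r_{xy}\, d_y, & |x| < n. \end{cases} \tag{5}$$

We prove the stronger result that $d_x \ge f_x$. It is clear that

$$d_0 = \sum_{|x|=n} r_x \beta_x, \qquad d_x = f_x \quad \text{for } |x| = n.$$

If $d_y \ge f_y$ for $|y| = k + 1$, then for $|x| = k$

$$d_x = \sum_{y \in A_x} r_{xy}\, d_y \ge \sum_{y \in A_x} r_{xy}\, f_y \ge \sum_{\substack{c \text{ idle in } x \\ c \text{ not blocked in } x}} \min_{y \in A_{cx}} f_y = f_x.$$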
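
The recurrence for the numbers f and the comparison d ≥ f of Lemma 1 can be checked on a small example. The Python sketch below uses a hypothetical toy state diagram with n = 2; the sets A[x][c], the blocked-pair counts, and the two fixed rules are illustrative assumptions, and every unblocked call is accepted.

```python
from functools import lru_cache

# Hypothetical toy diagram: A[x][c] lists the states reachable from x
# by completing the idle, unblocked call c.
A = {
    "0": {"c1": ["a"], "c2": ["b"]},
    "a": {"c2": ["g", "h"]},
    "b": {"c1": ["h"]},
}
level = {"0": 0, "a": 1, "b": 1, "g": 2, "h": 2}
beta = {"g": 1, "h": 2}  # blocked idle pairs in the states with n calls up
n = 2


@lru_cache(maxsize=None)
def f(x):
    """f_x: beta_x at level n, else the sum over calls of the best successor value."""
    if level[x] == n:
        return beta[x]
    return sum(min(f(y) for y in succ) for succ in A[x].values())


def d(x, rule):
    """d_x of (5) for a fixed rule; rule[(x, c)] names the successor chosen for c."""
    if level[x] == n:
        return beta[x]
    return sum(d(rule[(x, c)], rule) for c in A[x])


rule_good = {("0", "c1"): "a", ("0", "c2"): "b", ("a", "c2"): "g", ("b", "c1"): "h"}
rule_bad = dict(rule_good)
rule_bad[("a", "c2")] = "h"  # a worse choice of route in state a

print(f("0"), d("0", rule_good), d("0", rule_bad))
```

The rule that picks a minimizing successor at every step attains the bound, while the other rule exceeds it, which is the distinction formalized next by the class D.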


We shall say that $R \in C$ puts $x \in S$ on an ascending path to a state
$z$ if and only if there exist $y_0, \ldots, y_{|z|}$ with $y_0 = 0$, $|y_i| = i$, $y_{|z|} = z$, and
$r_{y_i y_{i+1}} = 1$ for $i = 0, \ldots, |z| - 1$, and $x > 0$ is among $y_1, \ldots, y_{|z|}$. Let $D$
be the subset of all fixed rules $R \in C$ such that if $|z| = n$, and if $R$
puts $x, y$ with $y \in A_x$ on an ascending path to $z$, then $r_{xy} = 1$ only if,
with $c = \gamma(y - x)$,

$$f_y = \min_{w \in A_{cx}} f_w.$$

The numbers $\{f_x, |x| \le n\}$ are the key to optimal routing for low values
of $\lambda$, or to put it more picturesquely, they are the key to staying as far
away as possible from the blocking states in $\{x : |x| = n\}$, which are
the ones that provide the leading term in Pr{bl} as $\lambda \to 0$. We have


Theorem 1: Let $R \in D$ and $R^* \in C - D$. Then for all $\lambda$ small enough

$$\Pr\{bl\}_R < \Pr\{bl\}_{R^*}.$$


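Proof: Let $d_x^*$ be defined in terms of $R^*$ according to (5) used in Lemma
1. Since $R^* \notin D$, there exist $x, y, c$, and $\epsilon > 0$, such that

$$y \in A_x, \qquad \gamma(y - x) = c, \qquad r^*_{xy} = 1,$$

$$f_y \ge \min_{z \in A_{cx}} f_z + \epsilon, \tag{6}$$

and a maximal chain $0 = y_0, y_1, y_2, \ldots, y_{|x|-1}, y_{|x|} = x$ ascending
in $S$ such that

$$r^*_{y_i y_{i+1}} = 1, \qquad i = 0, \ldots, |x| - 1.$$

Now, using $d^* \ge f$,

$$d_x^* = \sum_{z \in A_x - \{y\}} r^*_{xz}\, d_z^* + d_y^* \ge \sum_{z \in A_x - \{y\}} r^*_{xz}\, d_z^* + f_y \ge f_x + \epsilon,$$

the last inequality a consequence of $d^* \ge f$ and the definition of $f$.
Similarly, if $d^*_{y_{i+1}} > f_{y_{i+1}}$, then

$$d^*_{y_i} = \sum_{z \in A_{y_i} - \{y_{i+1}\}} r^*_{y_i z}\, d_z^* + d^*_{y_{i+1}} > \sum_{z \in A_{y_i} - \{y_{i+1}\}} r^*_{y_i z}\, f_z + f_{y_{i+1}} \ge f_{y_i}.$$

Since $y_0 = 0$, we have $d_0^* > f_0$.
Setting $a = f_0$, $a^* = d_0^*$, and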

= 


k! j= 


Me: 





b= Aj, 
k: 


ll 
o 


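we have the asymptotic forms

$$\Pr\{bl\}_R = \frac{a + \epsilon}{b + \delta}, \qquad \Pr\{bl\}_{R^*} = \frac{a^* + \epsilon^*}{b + \delta^*},$$

with $\epsilon, \delta, \epsilon^*, \delta^*$ all $o(1)$ as $\lambda \to 0$, and $a < a^*$. Since $b$ increases as $\lambda \to 0$,

$$(a - a^* + \epsilon - \epsilon^*)\, b < a^*\delta - a\delta^* + \epsilon^*\delta - \epsilon\delta^*$$

for all $\lambda$ small enough. This is equivalent to

$$ab + \epsilon b + a\delta^* + \epsilon\delta^* < a^* b + \epsilon^* b + a^*\delta + \epsilon^*\delta,$$

that is, to

$$\frac{a + \epsilon}{b + \delta} < \frac{a^* + \epsilon^*}{b + \delta^*},$$

and proves the theorem.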

Low traffic analyses of the kind just employed can also shed some
light on the problem of rejecting or accepting unblocked calls. For
example, if a call $c$ is always refused in every state, then

$$r_{00} \ge 1$$

and

$$\Pr\{bl\} = \frac{p'(\beta + r)}{p'\alpha} \to \frac{r_{00}}{\alpha_0} \ge \frac{1}{\alpha_0} \qquad \text{as } \lambda \to 0.$$

However, if no unblocked call is rejected, then Pr{bl} $\to 0$ as $\lambda \to 0$.
Thus, always refusing $c$ cannot be optimal if $\lambda$ is sufficiently small.
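
This effect is visible even in the smallest nontrivial example. The Python sketch below treats a hypothetical system in which two inlet-outlet pairs, c1 and c2, share a single link, so that either call blocks the other. It assumes hangups at unit rate and attempts at rate lambda per idle pair, takes unnormalized equilibrium weights from the balance between the zero state and each one-call state, and compares accepting every call with always refusing c2.

```python
from fractions import Fraction


def loss(lam, accept_c2):
    """Fraction of attempts lost, p'(beta + r) / p'alpha, in the two-pair toy system."""
    # Unnormalized weights: weight(0) = 1 and weight({c}) = lam when c is accepted,
    # since a one-call state is entered at rate lam and left at rate 1.
    states = {
        "0": dict(w=Fraction(1), alpha=2, beta=0, r=0 if accept_c2 else 1),
        "c1": dict(w=lam, alpha=1, beta=1, r=0),
    }
    if accept_c2:
        states["c2"] = dict(w=lam, alpha=1, beta=1, r=0)
    num = sum(s["w"] * (s["beta"] + s["r"]) for s in states.values())
    den = sum(s["w"] * s["alpha"] for s in states.values())
    return num / den


lam = Fraction(1, 100)
print(loss(lam, accept_c2=True), loss(lam, accept_c2=False))
```

With every call accepted the loss in this toy system is lambda / (1 + lambda), which vanishes with the traffic; with c2 always refused it is (1 + lambda) / (2 + lambda), which tends to one half.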
For another example, suppose as before that

$$n = \min_{y \in S} \{|y| : \beta_y > 0\} > 0,$$

and let $c$ be a call which is refused by $R$ in some state $x$ with $|x| < n$.
It is easy to see that for the rule $R$

$$\Pr\{bl\} \ge \frac{\dfrac{\lambda^{|x|}}{|x|!}\, r_x + o(\lambda^{|x|})}{\alpha_0 + O(\lambda)}.$$

On the other hand, if the rule $R_1$ refuses no unblocked calls,

$$\Pr\{bl\} = \frac{\dfrac{\lambda^n}{n!} \displaystyle\sum_{|y|=n} r_y^1 \beta_y + o(\lambda^n)}{\alpha_0 + O(\lambda)},$$

where the superscript 1 indicates that $R_1$ supplants $R$ in (4). For $\lambda$
small enough, then

$$\frac{\lambda^{|x|}}{|x|!}\, r_x > \frac{\lambda^n}{n!} \sum_{|y|=n} r_y^1 \beta_y$$

and $R_1$ is better than $R$. Thus, there is never any point in refusing an
unblocked call attempt made in a state $x$ whose norm or dimension is
less than the minimum norm achieved by the blocking states, if $\lambda$ is
small enough.


XI. REDUCTIONS TO LINEAR PROGRAMMING PROBLEMS 


Our effort to choose, with full information about the state of the net-
work, routes for new calls so as to minimize the probability of blocking
has led, upon the assumption of a simple probabilistic description for the
traffic, to this problem of mathematical programming: To minimize

$$\frac{p'(\beta + r)}{p'\alpha} \tag{7}$$

subject to $p \ge 0$, $p'1 = 1$, $p'Q = 0$, $Q = Q(R)$, $R \in C$.

It is relatively easy to see that this problem can be formulated as one
that has a bilinear (or linear fractional) objective function, and linear
constraints. We change variables to $U = (u_{xy})$ and $u_{cx}$ defined by

$$u_{xy} = p_x r_{xy}, \qquad y \in A_x,$$

$$u_{cx} = p_x - \sum_{y \in A_{cx}} u_{xy}, \qquad A_{cx} \ne \emptyset,$$

$$u_{xx} = \sum_{\substack{c \in x \\ c \text{ not blocked in } x}} u_{cx}.$$

Conversely, we introduce $p$ in terms of $U$ by setting

$$p_x = \begin{cases} \dfrac{\lambda}{|x|} \displaystyle\sum_{y \in B_x} u_{yx} & \text{if } s(x) = 0, \\[2ex] \dfrac{1}{s(x)} \Bigl( u_{xx} + \displaystyle\sum_{y \in A_x} u_{xy} \Bigr) & \text{if } s(x) > 0. \end{cases}$$

If $c$ is a call which can be completed in state $x$, then $A_{cx} \ne \emptyset$, and

$$\lambda \sum_{y \in A_{cx}} u_{xy}$$

is the equilibrium rate at which $c$ is completed in state $x$, and

$$\lambda u_{cx} = \lambda p_x - \lambda \sum_{y \in A_{cx}} u_{xy}$$

is the equilibrium rate at which $c$ is rejected in state $x$.

The transformation of variables from $p$ to $\{U, u_{cx}\}$ necessitates adding
additional constraints if a sensible problem is to result. Evidently, for
$c \in x$ not blocked in $x$

$$p_x = u_{cx} + \sum_{y \in A_{cx}} u_{xy}.$$

The left-hand side does not depend on $c$. For different $c \in x$ not blocked
in $x$ all these formally different ways of calculating $p_x$ must agree, and
it is, therefore, necessary to impose the additional constraint that

$$c, c' \in \gamma(A_x - x) = \{\gamma(y) : y = z - x \text{ for } z \in A_x\}$$

implies

$$u_{cx} + \sum_{y \in A_{cx}} u_{xy} = u_{c'x} + \sum_{y \in A_{c'x}} u_{xy}.$$


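The condition $p'Q = 0$ then gives the condition, for $s(x) > 0$,

$$\frac{|x|}{s(x)} \Bigl( u_{xx} + \sum_{y \in A_x} u_{xy} \Bigr) + \lambda \sum_{y \in A_x} u_{xy} = \sum_{y \in A_x} p_y + \lambda \sum_{y \in B_x} u_{yx},$$

with each $p_y$ on the right expressed in terms of $U$ as above,
to be satisfied by $U$. Naturally, the condition $U \ge 0$ is imposed. We
define $R$ in terms of $U$ by

$$r_{xy} = \begin{cases} 0 & \text{unless } y \in A_x \text{ or } y = x, \\[1ex] \dfrac{s(x)\, u_{xy}}{u_{xx} + \sum_{z \in A_x} u_{xz}} & \text{if } y \in A_x, \\[2ex] s(x) - \displaystyle\sum_{y \in A_x} r_{xy} & \text{if } y = x. \end{cases}$$

The normalization condition $p'1 = 1$, finally, amounts in terms of $U$ to

$$\sum_{s(x)=0} \frac{\lambda}{|x|} \sum_{y \in B_x} u_{yx} + \sum_{s(x)>0} \frac{1}{s(x)} \Bigl( u_{xx} + \sum_{y \in A_x} u_{xy} \Bigr) = 1.$$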


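In terms of $U$ the objective function is

$$\frac{\lambda \displaystyle\sum_{s(x)=0} \frac{\beta_x}{|x|} \sum_{y \in B_x} u_{yx} + \sum_{s(x)>0} \frac{\beta_x}{s(x)} \Bigl( u_{xx} + \sum_{y \in A_x} u_{xy} \Bigr) + \sum_{s(x)>0} u_{xx}}{\lambda \displaystyle\sum_{s(x)=0} \frac{\alpha_x}{|x|} \sum_{y \in B_x} u_{yx} + \sum_{s(x)>0} \frac{\alpha_x}{s(x)} \Bigl( u_{xx} + \sum_{y \in A_x} u_{xy} \Bigr)}.$$
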

It is possible to describe linear programming problems which are
equivalent to our nonlinear problem of optimal routing. Two ways of
reducing (7) to a linear programming problem will now be discussed.
The first is due to A. Charnes and W. W. Cooper. Let $q = tp$, where
the scalar $t \ge 0$ is to be chosen so that $q'\alpha = a$, with $a > 0$ a specified
real number. Consider now the "adjoined" linear programming problem
of finding $q, t, r$ minimizing $q'(\beta + r)$, subject to $q, t \ge 0$, $q'1 - t = 0$,
$q'Q = 0$, $q'\alpha = a$, $Q = Q(R)$, $r = r(R) = \{r_{xx}, x \in S\}$, $R \in C$. (The
argument just described shows that the constraints are linear.)

Theorem 2: For any $a > 0$, if $q, t, r$ is a solution of the "adjoined" linear
problem, then $p = q/t$ is a solution of (7).
Proof: It is necessary to show first that indeed $t > 0$. Suppose $q, 0, r$ is a
solution. Then $q'1 = 0$ and $q \ge 0$ imply $q = 0$, so that $q'\alpha = 0$; but
$q'\alpha = a > 0$. Hence, $t > 0$.

If $p'Q = 0$ and $Q = Q(R)$, we use $r_p$ to mean the vector $\{r_{xx}, x \in S\}$.
Now suppose that there is a solution $p$ of (7) for which

$$\frac{p'(\beta + r_p)}{p'\alpha} < \frac{(q/t)'(\beta + r)}{(q/t)'\alpha} = \frac{q'(\beta + r)}{a}. \tag{8}$$

Now $p'\alpha > 0$, because for any $R \in C$ the corresponding value of $p_0$
(0 = zero state, with no calls up) is $> 0$, and $\alpha_0 > 0$. Hence, there is a
$\theta > 0$ such that $p'\alpha = \theta a$. Consider $\hat{q} = \theta^{-1} p$, $\hat{t} = \theta^{-1}$. Then

$$\hat{q}'\alpha = \theta^{-1} p'\alpha = a$$

and $\hat{q}, \hat{t}$ satisfy $\hat{q}, \hat{t} \ge 0$, $\hat{q}'Q = 0$, $\hat{q}'1 - \hat{t} = 0$. But,

$$\frac{p'(\beta + r_p)}{p'\alpha} = \frac{\theta^{-1} p'(\beta + r_p)}{\theta^{-1} p'\alpha} = \frac{\hat{q}'(\beta + r_p)}{\hat{q}'\alpha} = \frac{\hat{q}'(\beta + r_p)}{a}.$$

Hence, (8) implies $q'(\beta + r) > \hat{q}'(\beta + r_p)$, because $a > 0$. This
contradicts the optimality of $q, t, r$ for the "adjoined" problem.
A cognate reduction to a linear programming problem can be ob- 
tained from a lemma of C. Derman,”’ included for completeness: 


Lemma 2: The nonlinear function

$$g(x) = \frac{c'x}{d'x}$$

can be minimized subject to x ≥ 0, Ax = b, by solving a linear programming
problem if (i) Ax = 0, x ≥ 0 imply x = 0 and (ii) x ≥ 0, Ax = b
imply d'x > 0.


Proof: Conditions (i) and (ii) imply that the transformation

$$z = \begin{pmatrix} x / d'x \\ 1 / d'x \end{pmatrix}$$

is one-to-one between {x ≥ 0, Ax = b} and z satisfying z ≥ 0, d'z = 1
(with d extended by a zero in the last component), and Bz = 0, where

$$B = (A, \; -b).$$

Under the transformation g(x) becomes a linear function. It can be
verified that (i) and (ii) of Derman’s lemma apply to the routing problem (7).
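
The transformation in Lemma 2 can be checked numerically on a small instance. The sketch below is only an illustration: the data `c`, `d`, `A`, `b`, `x` and the helper names are made up for the example and do not come from the routing problem. It maps a feasible x to z = (x/d'x, 1/d'x) and confirms that the ratio c'x/d'x equals a linear function of z, that Bz = 0 with B = (A, −b), and that the map can be inverted.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def to_z(x, d):
    """Map a feasible x to z = (x / d'x, 1 / d'x); requires d'x > 0."""
    s = dot(d, x)
    if s <= 0:
        raise ValueError("condition (ii) fails: d'x must be positive")
    return [xi / s for xi in x] + [1 / s]

def to_x(z):
    """Invert the map: the last component of z is t = 1 / d'x."""
    t = z[-1]
    return [zi / t for zi in z[:-1]]

# Hypothetical data chosen so that conditions (i) and (ii) hold.
c = [F(3), F(1)]
d = [F(1), F(2)]
A = [[F(1), F(1)]]
b = [F(4)]
x = [F(1), F(3)]            # x >= 0 and Ax = b

z = to_z(x, d)
ratio = dot(c, x) / dot(d, x)                  # nonlinear objective g(x)
linear = dot(c + [F(0)], z)                    # the same value, linear in z
residual = [dot(row + [-bi], z) for row, bi in zip(A, b)]   # B z with B = (A, -b)
normalization = dot(d + [F(0)], z)             # the constraint d'z = 1
```

Minimizing the linear objective over z ≥ 0, Bz = 0, d'z = 1 and mapping the minimizer back with `to_x` then gives a minimizer of g.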


XII. REFORMULATION AS A MARKOV DECISION PROCESS 


In Section VII the problem of optimal routing was cast as that of
minimizing the probability of blocking, a bilinear or linear fractional
functional of the equilibrium probability vector p, subject to linear
constraints. In Section XI it was shown how this problem could be reduced
to a linear programming problem which, however, is at best only
suggestive in identifying optimal policies. We shall now state an elementary
probabilistic result which implies that minimizing the probability of
blocking, and maximizing the fraction of events that are successful
attempts, are equivalent. This fact permits a direct dynamic programming
approach through Markov decision processes, and again leads to a linear
programming problem, with the difference, though, that it actually
enables us to study optimal policies for many cases, to be described.


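Theorem 3: Let p be an equilibrium probability vector for a process x_t
resulting from use of some rule R ∈ C. Let

$$m = \sum_{x \in S} |x|\, p_x = \text{average number of calls in progress};$$

then both

$$\Pr\{\mathrm{bl}\} = \frac{\lambda p'(\beta + r)}{m + \lambda p'(\beta + r)}$$

and

$$\text{fraction of events that are successful attempts} = \frac{m}{2m + \lambda p'(\beta + r)}.$$

Proof: For the first formula with s = {s(x), x ∈ S}

$$\Pr\{\mathrm{bl}\} = \frac{p'(\beta + r)}{p'\alpha} = \frac{p'(\beta + r)}{p'(s - r) + p'(\beta + r)}$$

and λp'(s − r) = m, since the average rate of successes must equal that
of hangups, in equilibrium, and α = β + s.
The second quantity is

$$\frac{\text{average rate of successes}}{\text{average rate of events}} = \frac{\lambda p'(s - r)}{m + \lambda p'\alpha} = \frac{m}{2m + \lambda p'(\beta + r)}.$$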
An immediate consequence is: 


Theorem 4: Maximizing the fraction of events that are successful attempts
is equivalent to minimizing the probability of blocking.
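
The equivalence can be made explicit with a short calculation. In the sketch below, P denotes the probability of blocking p'(β + r)/p'α, F the fraction of events that are successful attempts, and b = λp'(β + r) is a shorthand introduced here only for the derivation. It uses the relations λp'(s − r) = m and α = β + s from the proof above.

```latex
% Shorthand (introduced here): b = \lambda p'(\beta + r).
P = \frac{b}{m + b}, \qquad F = \frac{m}{2m + b}.
% Then
1 - P = \frac{m}{m + b}, \qquad 2 - P = \frac{2m + b}{m + b},
\qquad\text{so}\qquad
F = \frac{1 - P}{2 - P}, \qquad
\frac{dF}{dP} = -\frac{1}{(2 - P)^2} < 0 .
```

Since F is a strictly decreasing function of P on [0, 1], any rule that minimizes P maximizes F, and conversely.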


The value of the preceding observations is that we can now reformulate
the routing problem as an effort to maximize

$$\lim_{n \to \infty} \frac{1}{n}\, E\{\text{number of successful attempts in } n \text{ events}\},$$

the asymptotic rate of successful attempts when time is counted discretely,
by events.

Since only events are at issue, and the epochs at which they occur are
irrelevant, we can discard the continuous parameter Markov process
{x_t, t real} in favor of a Markov chain {x_n, n an integer}, with a
transition matrix A = (a_xy) = A(R) given by

$$(|x| + \lambda\alpha_x)\, a_{xy} = \begin{cases} \lambda(\beta_x + r_{xx}) & x = y, \\ 1 & y \in B_x, \\ \lambda r_{xy} & y \in A_x, \\ 0 & \text{otherwise.} \end{cases}$$


The stationary vector q satisfying q' = q'A is related to p by

$$p_x = (\text{constant})\, \frac{q_x}{|x| + \lambda\alpha_x}.$$

Then

$$E\{\text{number of successful attempts in } n \text{ events}\} = \sum_{j=0}^{n-1} A^j v$$

where A = A(R) and v = v(R) given by

$$v_x = \frac{\lambda s(x) - \lambda r_{xx}}{|x| + \lambda\alpha_x} = \text{chance that first event to occur starting in } x \text{ is a successful call.} \qquad (9)$$
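
The construction of the event-counted chain can be sketched in a few lines. The function name, argument layout, and the two-state toy data below are assumptions made for illustration and do not describe any network in the paper. The sketch only follows the recipe stated in the text: each row of A is scaled by |x| + λα_x, β_x is taken as α_x − s(x), and the routing rates are assumed to satisfy r_xx + Σ_{y∈A_x} r_xy = s(x).

```python
from fractions import Fraction as F

def embedded_chain(states, size, alpha, s, up, down, r, lam):
    """Return (A, v) as dicts: event-chain transition rows and one-step success chances.

    size[x] = |x|, alpha[x] = idle pairs, s[x] = unblocked idle pairs,
    up[x] = A_x, down[x] = B_x, r[(x, y)] = routing rates (r[(x, x)] = rejection).
    """
    A, v = {}, {}
    for x in states:
        total = size[x] + lam * alpha[x]        # total event rate in x
        beta = alpha[x] - s[x]                  # blocked idle pairs
        rej = r.get((x, x), F(0))               # rejected attempts
        row = {y: F(0) for y in states}
        row[x] += lam * (beta + rej) / total    # blocked or rejected attempt: stay
        for y in down[x]:
            row[y] += F(1) / total              # hangup
        for y in up[x]:
            row[y] += lam * r.get((x, y), F(0)) / total   # successful attempt
        A[x] = row
        v[x] = lam * (s[x] - rej) / total       # chance the next event is a success
    return A, v

# Hypothetical two-state toy: state 0 is empty, state 1 has one call in progress.
states = [0, 1]
lam = F(1, 2)
A, v = embedded_chain(
    states,
    size={0: 0, 1: 1},
    alpha={0: 2, 1: 1},
    s={0: 2, 1: 0},
    up={0: [1], 1: []},
    down={0: [], 1: [0]},
    r={(0, 1): F(2)},
    lam=lam,
)
q = {0: F(2, 5), 1: F(3, 5)}                    # candidate stationary vector
gain = sum(q[x] * v[x] for x in states)         # long-run fraction of successes
```

For this toy the stationary vector can be verified by hand, and the resulting gain q'v is the long-run fraction of events that are successful attempts.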


Thus, the problem of optimal routing can be cast in the form of the
Markov decision processes studied by e.g., R. Bellman and R. Howard:
For R ∈ C and A = A(R) = (a_xy) given by

$$(|x| + \lambda\alpha_x)\, a_{xy} = \begin{cases} \lambda(\beta_x + r_{xx}) & x = y, \\ 1 & y \in B_x, \\ \lambda r_{xy} & y \in A_x, \\ 0 & \text{otherwise,} \end{cases}$$

the minimum

$$\min_{R \in C} \Pr\{\mathrm{bl}\} = \min_{R \in C} \frac{p'(\beta + r)}{p'\alpha}$$

subject to p'Q = 0, p ≥ 0, p'1 = 1 is achieved by the R which maximizes
the scalar ρ such that

$$\rho 1 = \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} A^j v$$

with v given by (9).
The results of Bellman in Ref. 10 were derived under the strong positivity
condition a_xy ≥ d > 0 on the matrices A; this condition is of
course not met in our routing problem, since many a_xy necessarily vanish.
However, since our matrices have only one ergodic set it is still possible
to obtain results like Bellman’s provided only that a little care is taken
with the transient states.


Lemma 3: Let ρ be the scalar defined by

$$\rho 1 = \max_{R \in C} \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} A^j v, \qquad (10)$$

let R* achieve the maximum in (10), and let g be the vector determined up
to a multiple of (the vector) 1 by the equation

$$\rho 1 + g = v(R^*) + A(R^*)g.$$

Let R* achieve the maximum in

$$\max_{R \in C} \{v(R) + A(R)g\}.$$

Let F be the transient set of states relative to A(R*). Then the restriction
of g to S − F satisfies the nonlinear equation

$$\rho + g_x = \max_{R \in C} \Big\{ v_x(R) + \sum_{y} a_{xy}(R)\, g_y \Big\}, \qquad x \in S - F, \qquad (11)$$

and the right-hand side of (11) depends in fact only on {g_y, y ∈ S − F}.

Further, there is a fixed routing matrix R**, agreeing with R* on (S − F)²,
and a vector g* agreeing with g on S − F, such that R** achieves the maximum in

$$\rho 1 + g^* = \max_{R \in C} \{v(R) + A(R)g^*\}.$$


Proof: If the nonlinear equation given does not hold for some x ∈ S − F,
there exists a vector ε with ε ≠ 0, ε ≥ 0 such that on S − F

$$\rho + \varepsilon_x + g_x = \max_{R \in C} \Big\{ v_x(R) + \sum_{y} a_{xy}(R)\, g_y \Big\} = v_x(R^*) + \sum_{y} a_{xy}(R^*)\, g_y.$$

Let us restrict all vectors to the |S − F| components present in S − F,
and the matrix A(R*) to (S − F)². Then, dropping dependence on R*,

$$\rho 1 + \varepsilon + g = v + Ag.$$

There exists an integer k such that A^k > 0 strictly. Left-multiply by
A^k and note that A1 = 1 to obtain

$$\rho 1 + A^k(\varepsilon + g) = A^k v + A^{k+1} g.$$

Since A^k is a positive matrix, and ε ≠ 0, ε ≥ 0, there exists a scalar e > 0
such that A^k ε ≥ e1, so that

$$(\rho + e)1 + A^k g \le A^k v + A^{k+1} g.$$

Iterating this inequality n times we obtain

$$n(\rho + e)1 + A^k g \le \sum_{j=k}^{k+n-1} A^j v + A^{k+n} g.$$

For n large enough this contradicts the maximal character of ρ. To find
R** and g*, consider the equation

$$g_x^* = -\rho + \max_{R \in C} \Big\{ v_x(R) + \sum_{y \in F} a_{xy}(R)\, g_y^* + \sum_{y \in S - F} a_{xy}(R)\, g_y \Big\}, \qquad x \in F.$$

This represents the expected best possible fortune of a gambler who
starts broke in state x ∈ F, plays by choosing a matrix R paying an
amount ρ to play, receiving v_x(R) if he plays R in state x, and ending
the game with a final payoff of g_y if the system leaves F for the first time
by going into y ∈ S − F; i.e., if he passes through x_1, x_2, …, x_n playing
R_1, …, R_n (with R_i in x_i), going out to y ∈ S − F from x_n, then he
receives (or owes)

$$-n\rho + \sum_{i=1}^{n} v_{x_i}(R_i) + g_y.$$

It is apparent that {g_x*, x ∈ F} exist; R** on F² ∪ (F × (S − F)) is
determined by the property that it achieves the maximum above, and on
(S − F) × F it is zero.


Lemma 4: Let ρ be the scalar defined by the condition

$$\rho 1 = \max_{R \in C} \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} A^j v, \qquad (12)$$

and let the vector g be a solution of the nonlinear inequality

$$\rho 1 + g \le \max_{R \in C} \{v(R) + A(R)g\}. \qquad (13)$$

If R* ∈ C achieves the maximum on the right of (13), then it also achieves
that on the right of (12).
Proof: R* and g are related by

$$\rho 1 + g \le v(R^*) + A(R^*)g,$$

whence, left-multiplying by A^j = A^j(R*) and summing on j from 0 to
(n − 1),

$$n\rho 1 + \sum_{j=0}^{n-1} A^j g \le \sum_{j=0}^{n-1} A^j v + \sum_{j=1}^{n} A^j g,$$

$$\rho 1 \le \frac{1}{n} \sum_{j=0}^{n-1} A^j(R^*)\, v(R^*) + \frac{A^n(R^*)g - g}{n}.$$

This implies that R* achieves the maximum in (12).
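
The functional equation ρ1 + g = max{v(R) + A(R)g} is the kind of relation that relative value iteration solves numerically. The sketch below is not the paper's procedure; it is a generic illustration on a made-up two-state decision process (all rewards and transition rows are hypothetical), with every transition row strictly positive so that the iteration converges.

```python
def relative_value_iteration(mdp, iters=500):
    """Approximate the gain rho and relative values h with h[0] = 0.

    mdp[x] is a list of (reward, transition_row) pairs, one per available choice.
    """
    n = len(mdp)
    h = [0.0] * n
    rho = 0.0

    def value(x, a, h):
        reward, row = mdp[x][a]
        return reward + sum(p * hy for p, hy in zip(row, h))

    for _ in range(iters):
        t = [max(value(x, a, h) for a in range(len(mdp[x]))) for x in range(n)]
        rho = t[0]                       # normalize against reference state 0
        h = [tx - rho for tx in t]
    policy = tuple(
        max(range(len(mdp[x])), key=lambda a: value(x, a, h)) for x in range(n)
    )
    return rho, h, policy

# Hypothetical data: two choices in each of two states.
mdp = [
    [(1.0, [0.5, 0.5]), (0.7, [0.9, 0.1])],   # state 0
    [(0.0, [0.5, 0.5]), (0.2, [0.2, 0.8])],   # state 1
]
rho, h, policy = relative_value_iteration(mdp)
```

At convergence the pair (rho, h) satisfies the maximum equation in every state, and the maximizing choices form a fixed policy, in the spirit of Lemma 4.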


XIII. OPTIMALITY OF FIXED RULES 


If a routing matrix has any entries other than integers, its use intro- 
duces a certain amount of additional randomness into the operation of 
the network, over and above that due to the random traffic, and may be 
said to represent a “mixed” strategy. It is a natural intuition that since 
minimizing the probability of loss is a game played against nature, rather 
than against an intelligent adversary, there can be no real gain from this 
additional randomization, i.e., that a fixed rule can be found that is as 
good as any “mixed strategy”. To this effect we formulate


Theorem 5: A fixed rule R achieves

$$\min \frac{p'(\beta + r)}{p'\alpha}$$

subject to R ∈ C, p'Q = 0, p'1 = 1, p ≥ 0, Q = Q(R).
This theorem is a consequence of the next two results, which, though
they are adapted from work of C. Derman, are included here for
completeness.


Lemma 5: Let ξ(·): C → E^{|S|} be an affine map of C into |S|-dimensional
Euclidean space, i.e., one such that for real scalars a_1, a_2 ≥ 0 with
a_1 + a_2 = 1, and R_1, R_2 ∈ C,

$$\xi(a_1 R_1 + a_2 R_2) = a_1 \xi(R_1) + a_2 \xi(R_2),$$

and let ξ be continuous. Then,

$$\min q'\xi$$

subject to q ≥ 0, q'1 = 1, q'A = q', A = A(R), ξ = ξ(R) is achieved by
a fixed rule R.
Proof: For R ∈ C and A = A(R), ξ = ξ(R) set

$$v(R) = \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} A^j \xi.$$

By a known Markov chain limit theorem, v(R) is well-defined. For
μ ∈ (0,1) let

$$V(R,\mu) = \sum_{j=0}^{\infty} (\mu A)^j \xi.$$

It is clear that for each μ ∈ (0,1), and each starting state x, there exists
an R_{μx} ∈ C such that

$$V_x(R_{\mu x},\mu) = \min_{R \in C} V_x(R,\mu).$$

Then

$$V_x(R_{\mu x},\mu) = \min_{R \in C} \Big\{ \xi_x(R) + \mu \sum_{y \in S} a_{xy}(R)\, V_y(R_{\mu y},\mu) \Big\}.$$

The right-hand side is an affine functional of R and so assumes a minimum
at an extreme point of C, i.e., at a fixed rule R. Thus, we can consider
that R_{μx} is a fixed rule. Since the fixed rules form a finite class,
there exists a sequence μ_n → 1 and a fixed rule R* such that

$$R_{\mu_n x} = R^*, \qquad n = 1,2,\ldots.$$

By a well-known Abelian theorem, for R ∈ C

$$\lim_{\mu \to 1} (1 - \mu)\, V(R,\mu) = v(R)$$

and also

$$v(R) \ge v(R^*).$$


Thus, R* is optimal.

Theorem 6: Let ξ, η: C → E^{|S|} be affine maps of C into |S|-dimensional
Euclidean space, and let ξ and η be continuous, with η(R) > 0 for R ∈ C.
Then

$$b = \min \frac{q'\xi}{q'\eta}$$

subject to q ≥ 0, q'A = q', q'1 = 1, A = A(R), ξ = ξ(R), and η = η(R)
is achieved by a fixed rule.

Proof: Let b(R) be the value of q'ξ/q'η for a given choice R, with q
determined by the constraints q ≥ 0, q'A = q', q'1 = 1. There exist
R_1, R_2, … ∈ C such that

$$\lim_{n \to \infty} b(R_n) = b.$$

For n fixed, let ξ(·) in Lemma 5 be given by

$$\xi - b(R_n)\,\eta.$$

Then in the notation of Lemma 5, v(R_n) = 0. By Lemma 5 there exists
a fixed rule R_n* such that

$$v(R_n^*) \le v(R_n) = 0,$$

that is, since q'η ≠ 0,

$$b(R_n^*) \le b(R_n).$$

Since there is a finite number of fixed rules, there is a subsequence
n_1, n_2, … and a fixed rule R* such that R*_{n_j} = R*, j = 1,2,….

Then R* is optimal.
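
A toy calculation shows why randomizing between fixed rules cannot help. The numbers below are hypothetical and the model is a two-state chain, not a telephone network. The sketch is phrased in the success-maximizing form of Section XII rather than the cost-minimizing form above: in state 0 the controller mixes two fixed choices with weight w, so that both the one-step reward and the transition row are affine in w, and the long-run reward is then a ratio of affine functions of w.

```python
from fractions import Fraction as F

def long_run_reward(w):
    """Long-run reward per event when state 0 uses choice A with weight w, else choice B."""
    reward0 = w * F(1) + (1 - w) * F(7, 10)       # affine in w
    leave0 = w * F(1, 2) + (1 - w) * F(1, 10)     # chance of moving 0 -> 1, affine in w
    leave1 = F(1, 2)                              # chance of moving 1 -> 0 (fixed)
    q0 = leave1 / (leave0 + leave1)               # stationary probability of state 0
    q1 = 1 - q0
    return q0 * reward0 + q1 * F(0)               # state 1 earns nothing

weights = [F(k, 10) for k in range(11)]
gains = [long_run_reward(w) for w in weights]
best = max(gains)
```

Because a ratio of affine functions is monotone along the segment between two rules, the best value on the grid is attained at an endpoint, i.e., at one of the two fixed choices.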


XIV. TRYING TO GET CLOSER TO THE OPTIMAL ROUTING RULES 


It is particularly important to try to verbalize, and eventually to
mechanize, routing strategies that are optimal, near-optimal, or by some
yardstick just “good”. In this endeavor, the fact that the original routing
problem (7) can be formulated and solved numerically as a linear
programming problem, while interesting theoretically and perhaps reassuring,
is nevertheless of limited usefulness. For this reason we have
attempted to take advantage of some of the special properties of the
problem that are due to its telephonic origins, and to describe at least
parts of optimal policies in terms of the combinatorial properties of the
connecting network upon which they ultimately depend.

In the second half of this paper we introduce some additional notions
and assumptions of a combinatorial nature. With their aid we are able to
exhibit parts of some actual optimal routing rules. The problem of finding
out something concrete about optimal policies has been so difficult that
we have quite frankly started with (and so far restricted attention to)
cases which can be treated by what T. M. Burford has called “domination”
arguments, which depend on or establish isotony properties for
certain networks having suitable monotone structures. The word ‘monotone’
is used loosely here: more specifically, the networks are to have the
property that the relative merit of states is consistent or continuous, i.e.,
that if one state x is “better” than an equivalent state y, then the neighbors
of x are in the same sense “better” than the corresponding neighbors
of y.

Although some of the combinatorial properties (on which the results 
to be given are based) are strong, we believe that these properties and 
the optimal policies (or partial policies) they lead to have a definite 
relevance to the practical aspects of optimal routing, if only because 
they bear out some of the intuitive conjectures offered in Section VIII. 
Our results show not only that these conjectures are “in the right
ballpark,” but also that in many instances they describe optimal policies.

We start our discussion with four simple examples; once the ideas in- 
volved are understood, the principles behind them can be abstracted, 
and general theorems proved. 

It has been shown (Section XII) that minimizing the probability of
blocking is equivalent to maximizing the fraction of events that are
successful attempts, where an event is either a hangup, a blocked attempt,
or a successful one. This maximal fraction is the limit, as n becomes
large, of

$$\frac{1}{n}\, E_x(n),$$

where
E_x(n) = expected number of successful calls in n events, if the network
starts in state x and an optimal policy is followed.†
We shall base our approach on the vectors E(n).

† Here an optimal policy is one for which the expected number of successful
calls in n steps is a maximum.


First example: Consider the overflow system or grading shown in Figs. 7
and 8. There are two groups of lines, each of two lines; the first has access
to both trunks to the destination, but the second has access to the
second trunk only. The possible states of this system (reduced under the
equivalence relation induced by permuting lines within a line group)
form the partially ordered system of Fig. 8. There is only one situation
which demands a choice between alternative routes for a call; it arises
when a call from line group 1 is accepted with no calls in progress. The
two alternatives are indicated in Fig. 8 by the notation “ch”: one is to
put the call on trunk 1, leaving no lines blocked, the other is to put it on
trunk 2, leaving 2 lines blocked.

What circumstances make one choice of a route better than another?
In the present instance it is clear that use of trunk 1 for a group 1 call in
state 0 leaves the “high access” trunk 2 free to serve group 2. Thus, at
first glance a route whose use blocked the smallest possible number of
additional calls (over and above those that are already blocked) seems to
be best. It is natural to expect that in state 0 a new call from group 1
should be routed on trunk 1 and not on trunk 2. Indeed, it can be shown
that if such a call should be accepted then it should be placed on trunk 1.
(For small λ it should always be accepted, as was proved in Section X.)
Thus, a policy which routes a group 1 call on trunk 1 in state 0 can differ
from an optimal policy only in that it might accept some calls which the
other rejected, and vice versa.

Fig. 7 — Asymmetric grading. (Line groups 1 and 2; trunks 1 and 2.)

Fig. 8 — States of the grading of Fig. 7. (The notation CH indicates that a
choice is possible between two different ways of putting up a particular
call; (1-2) means that a call from group 1 is on trunk 2.)

Rather than proving the result stated above, we shall discuss other 
examples, involving different kinds of network: it will turn out that 
similar circumstances arise. Indeed, we shall claim that the particular 
circumstance on which the result is based is no isolated happenstance, 
but a phenomenon common enough to be relevant to the theory of rout- 
ing. All examples discussed here, as well as many others, will be covered 
by a general result (Theorem 14) proved later. 


Second example: Referring to Fig. 2, which shows the reduced state di- 
agram of the three-stage Clos network of Fig. 1, we observe that only in 
the state numbered 4 are there any blocked calls. State 4 realizes the same
assignment of inlets to outlets as state 2, which has no blocked calls. The 
difference between the two is that in state 2 all the traffic passes through 
one middle switch, leaving the other entirely free for any call that may 
arise. This difference illustrates the intuitive rule that one should always 
put a call through the most heavily loaded part of the network that will 
still accept it. This example was discussed in Refs. 1, 2 where it was shown 
(rather laboriously) that if no calls are rejected, then preferring state 2 
to state 4 in state 1 is optimal. This result will be an instance of Theorem 
14. 


Third example: It is to be expected that in some instances a choice of 
route for a call is immaterial. The concentrating switch depicted in Figs. 
9 and 10 is a simple example of this phenomenon. It is intuitively ob- 
vious that, because of the symmetries of the network, it makes no differ- 
ence which of the two trunks a call could use when the system is empty 
is assigned to it. This insensitivity of performance to routing choices 
can actually be deduced from Theorem 7. 


Fig. 9 — Symmetric grading. (Two line groups; trunks 1, 2, and 3.)

Fig. 10 — States of the grading of Fig. 9. (Three-call states: (1-1)(1-3)(2-2),
(1-1)(2-2)(2-3); two-call states: (1-1)(2-3), (1-1)(1-3), (1-3)(2-2),
(2-2)(2-3), (1-1)(2-2).)


Fourth example: Figs. 11 and 12 show the structure and (reduced) state 
diagram for another simple Clos network made of 3 X 3 inlet and outlet 
switches, and 2 X 2 middle switches. Again, from scrutiny of the state 
diagram we guess that optimal routing will result if no empty middle 
switches are used when partially filled ones are available. The notations 
‘B’ in Fig. 12, intended to suggest that the states to the left of the B’s 
are “better” than those on the right, constitute an expression of the
corresponding policy, and are explained in the next paragraphs.


Fig. 11 — 3-stage Clos network with 3 × 3 outer switches (2 × 2 middle switches).

Fig. 12 — States of 3-stage Clos network of Fig. 11.


To abstract the essential features of the preceding examples into a
general theorem, we start with the observation that in choosing to enter
a state x rather than another y in putting up a call we have always to
choose between equivalent states (x ~ y, in the sense of Section III), in
which the same events e can occur. In particular, the same new calls c
can arise. If it now happens that every new call blocked in x is also blocked
in y, let us regard this as prima facie evidence that x is somehow “better”
than y, and define a relation B ⊆ S² by the condition

xBy if and only if x ~ y and
c ∈ x, c blocked in x imply c blocked in y.

The relation B is a partial ordering.




In the first example considered above, (1-1)B(1-2), and B obtains 
between no other distinct states; in the second, 2B4, and again B ob- 
tains between no other distinct states. 
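
The relation B can be enumerated mechanically for the grading of Fig. 7. The sketch below rests on stated assumptions: a state is encoded as a set of (group, trunk) pairs, two states are treated as equivalent when they carry the same number of calls from each group, and a group counts as blocked when it has an idle line but no idle accessible trunk. These encodings and all names are illustrative choices, not the paper's formal definitions.

```python
from itertools import combinations

ACCESS = {1: (1, 2), 2: (2,)}      # trunks reachable from each line group (Fig. 7)
LINES = {1: 2, 2: 2}               # two lines per group
CALLS = [(g, t) for g in ACCESS for t in ACCESS[g]]

def is_state(calls):
    """A set of calls is a state if no trunk is used twice and no group overdraws lines."""
    trunks = [t for _, t in calls]
    return (len(set(trunks)) == len(trunks)
            and all(sum(1 for g, _ in calls if g == h) <= LINES[h] for h in LINES))

STATES = [frozenset(c) for k in range(len(CALLS) + 1)
          for c in combinations(CALLS, k) if is_state(c)]

def blocked_groups(state):
    """Groups with an idle line whose accessible trunks are all busy."""
    busy = {t for _, t in state}
    out = set()
    for g in LINES:
        has_idle_line = sum(1 for h, _ in state if h == g) < LINES[g]
        if has_idle_line and all(t in busy for t in ACCESS[g]):
            out.add(g)
    return out

def equivalent(x, y):
    """Assumed equivalence: same number of calls in progress from each group."""
    return sorted(g for g, _ in x) == sorted(g for g, _ in y)

def B(x, y):
    """x B y: x ~ y and every call blocked in x is also blocked in y."""
    return equivalent(x, y) and blocked_groups(x) <= blocked_groups(y)

strict_pairs = [(x, y) for x in STATES for y in STATES if x != y and B(x, y)]
```

Under these assumptions the enumeration finds six states and a single pair of distinct states related by B, namely (1-1) B (1-2), in agreement with the statement in the text.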

Let us now suppose (for a general network with state set S) that the
network is run according to a policy φ, and ask what happens to B under
φ. That is, more specifically, we look at states x,y such that xBy, and we
consider, for events e that are either hangups or new calls blocked in
neither x nor y, whether or not

$$\varphi(e,x)\, B\, \varphi(e,y).$$

If e occurs and φ is used for decisions, then the system moves from x
to φ(e,x) and from y to φ(e,y). If φ(e,x)Bφ(e,y) for all e ∈ x that are
either hangups or new calls blocked in neither x nor y, whenever xBy,
we say that φ preserves B. Formally,
φ preserves B if and only if xBy implies φ(e,x)Bφ(e,y) for
e ∈ x which are either hangups or new calls
blocked in neither x nor y.

In the first example (Fig. 8) there are no new calls c which can be
put up in both (1-1) and (1-2), and there is one hangup (say h) which
can occur in both. Thus, the set of events to be considered is just {h}.
Clearly, φ(h,(1-1)) = φ(h,(1-2)) = 0 (the zero state) for any φ. Since B is
reflexive, we conclude that in this case every φ preserves B.

In the second example, a similar situation arises. There are two
events to be considered: one is a new call completable in both 2 and 4
leading to state 6, the other is a hangup leading to 1. Again

$$\varphi(e,2) = \varphi(e,4)$$

for all φ and both events e to be considered, and again any φ preserves
B.

As noted, routing has no effect in the third example. However, the 
relation B is defined. It can be verified that any ¢ preserves B, and that 
in this case B is a symmetric relation, as it should be, since if routing 
is to have no effect, then x can only be “just as good” as y if y is ‘just 
as good” as x. These facts can be used to prove that routing has no 
effect in this example. 

The fourth example, finally, shows the relation B in action. The
notations

x--B--y,    x,y states

in Fig. 12 show the irreflexive part of B. (Obviously xBx for all x ∈ S,
and this part of B is not shown in Fig. 12.) The reader is invited to
verify that the policy φ of using a partly-filled middle switch whenever
possible does indeed preserve B in this example.

The property of a policy φ, that it preserves B, is to be viewed as a
kind of isotony of φ:

xBy implies φ(e,x)Bφ(e,y), for suitable e.

(See G. Birkhoff, p. 3.) It can also be viewed as a kind of continuity,
for after all if we think of the set of neighbors N_y of y as the states in

$$N_y = A_y \cup B_y,$$

then the property says that if xBy then also zBw where z is a neighbor
of x and w a neighbor of y such that z ~ w. In other words it states
that if xBy then also

$$(N_x \times N_y) \cap (\sim) \subseteq B,$$

i.e., if it holds between x and y then it also holds between equivalent
neighbors of x and y.
Note that if φ preserves B, xBy, and φ rejects in x a call c not blocked
in y, then it also rejects it in y.
For φ a policy, let
E_x(n,φ) = expected number of successful attempts in n events,
if the network starts in state x and policy φ is
followed.
The isotonic property that φ preserve B has the useful feature that it
implies an isotony among the numbers

{E_x(n,φ), n ≥ 1, x ∈ S}.

This is the content of the next result.

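Theorem 7: (First Isotony Theorem): If φ preserves B, then xBy implies

$$E_x(n,\varphi) \ge E_y(n,\varphi), \qquad n = 1,2,\ldots.$$

Proof: xBy, c ∈ x, φ(c,y) ≠ y imply φ(c,x) ≠ x. Hence,

$$\sum_{\substack{c \in x \\ \varphi(c,x) = x}} 1 \;\le\; \sum_{\substack{c \in y \\ \varphi(c,y) = y}} 1$$

and E_x(1,φ) ≥ E_y(1,φ). As a hypothesis of induction assume that xBy
implies

$$E_x(n,\varphi) \ge E_y(n,\varphi)$$

for some n ≥ 1. We have

$$E_x(n+1,\varphi) = \frac{\lambda}{|x| + \lambda\alpha_x} \sum_{\substack{c \in x \\ \varphi(c,x) \ne x}} \{1 + E_{\varphi(c,x)}(n,\varphi)\} + \frac{\lambda}{|x| + \lambda\alpha_x}\, E_x(n,\varphi) \sum_{\substack{c \in x \\ \varphi(c,x) = x}} 1 + \frac{1}{|x| + \lambda\alpha_x} \sum_{h \in x} E_{x-h}(n,\varphi).$$

Since φ preserves B, it must be true that xBy implies

$$\varphi(c,x)\, B\, \varphi(c,y), \qquad (x - h)\, B\, (y - h),$$

whence

$$E_{\varphi(c,x)}(n,\varphi) \ge E_{\varphi(c,y)}(n,\varphi), \qquad E_{x-h}(n,\varphi) \ge E_{y-h}(n,\varphi).$$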
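Therefore,

$$E_x(n+1,\varphi) \ge \frac{\lambda}{|y| + \lambda\alpha_y} \sum_{\substack{c \in y \\ \varphi(c,y) \ne y}} \{1 + E_{\varphi(c,y)}(n,\varphi)\} + \frac{\lambda}{|y| + \lambda\alpha_y}\, E_y(n,\varphi) \sum_{\substack{c \in y \\ \varphi(c,y) = y}} 1 + \frac{1}{|y| + \lambda\alpha_y} \sum_{h \in y} E_{y-h}(n,\varphi) = E_y(n+1,\varphi).$$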
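
The isotony can be observed numerically on the grading of Fig. 7. The sketch below makes several assumptions that are not taken from this part of the text: each idle line offers calls at rate λ (the convention adopted later in Section XV), the policy accepts every unblocked call and tries trunks in a fixed order, and states are encoded as sets of (group, trunk) pairs. It evaluates the n-event expectation by the same one-step recursion as in the proof, for x = (1-1) and y = (1-2).

```python
from fractions import Fraction
from functools import lru_cache

ACCESS = {1: (1, 2), 2: (2,)}   # group 1 tries trunk 1 first; group 2 sees trunk 2 only
LINES = {1: 2, 2: 2}            # two lines in each group

def idle_lines(state, g):
    return LINES[g] - sum(1 for h, _ in state if h == g)

def route(state, g):
    """Accept-all policy: use the first idle trunk in ACCESS order, or None if blocked."""
    busy = {t for _, t in state}
    for t in ACCESS[g]:
        if t not in busy:
            return state | {(g, t)}
    return None

@lru_cache(maxsize=None)
def E(state, n, lam):
    """Expected number of successful attempts in n events, starting from `state`."""
    if n == 0:
        return Fraction(0)
    total = len(state) + lam * sum(idle_lines(state, g) for g in LINES)
    acc = Fraction(0)
    for call in state:                        # hangups, unit rate each
        acc += E(state - {call}, n - 1, lam)
    for g in LINES:                           # attempts, rate lam per idle line
        k = idle_lines(state, g)
        if k == 0:
            continue
        nxt = route(state, g)
        if nxt is None:                       # blocked attempt: state unchanged
            acc += lam * k * E(state, n - 1, lam)
        else:                                 # successful attempt
            acc += lam * k * (1 + E(nxt, n - 1, lam))
    return acc / total

x = frozenset({(1, 1)})
y = frozenset({(1, 2)})
lam = Fraction(1)
gaps = [E(x, n, lam) - E(y, n, lam) for n in range(1, 5)]
```

For λ = 1 the differences E_x(n,φ) − E_y(n,φ) stay nonnegative over the horizons computed, as the theorem leads one to expect for this example.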


The power and utility of the relation B are further illustrated by the
following comparison theorem for policies. The partial ordering B on S
induces a natural partial ordering B of the policies according to the
definition

φBψ ≡ e ∈ x, x ∈ S imply φ(e,x) B ψ(e,x)

for e a hangup or a call not blocked in x. We note that φBψ implies that
φ and ψ embody the same rejection policy.


Theorem 8: If φBψ, and one of φ, ψ preserves B, then xBy implies

$$E_x(n,\varphi) \ge E_y(n,\psi), \qquad n = 1,2,\ldots.$$

Proof: φ and ψ have the same rejection policy, so E(1,φ) = E(1,ψ),
and the theorem holds for n = 1. Assume as a hypothesis of induction
that xBy implies E_x(n,φ) ≥ E_y(n,ψ) for a given value of n ≥ 1. We
have, with p_{ey} = Pr{e occurs in y},

$$E_y(n+1,\varphi) = E_y(1,\varphi) + \sum_{e \in y} p_{ey}\, E_{\varphi(e,y)}(n,\varphi).$$

But e ∈ y implies φ(e,y) B ψ(e,y), and so by the induction hypothesis

$$E_{\varphi(e,y)}(n,\varphi) \ge E_{\psi(e,y)}(n,\psi).$$

However,

$$E_y(n+1,\psi) = E_y(1,\psi) + \sum_{e \in y} p_{ey}\, E_{\psi(e,y)}(n,\psi) \le E_y(n+1,\varphi).$$

Let now xBy, and suppose that φ preserves B. The isotony theorem
then implies

$$E_x(n+1,\varphi) \ge E_y(n+1,\varphi) \ge E_y(n+1,\psi).$$

If, instead, ψ preserves B, then

$$E_x(n+1,\psi) \ge E_y(n+1,\psi)$$

and a repetition of the first part of the argument above with x instead of
y gives

$$E_x(n+1,\varphi) \ge E_x(n+1,\psi) \ge E_y(n+1,\psi).$$


XV. SECOND INTUITIVE APPROACH 


In an effort to develop a more general theory than the one that was 
begun in the previous two sections, we now make a fresh start at under- 
standing the structure of “good” routing; again, we begin with a special 
case: 


Fifth example: We choose the overflow system or grading depicted in 
Fig. 13. There are two groups of lines, one of two lines, the other of three 
lines. Each has access to one primary trunk to which the other does not 
have access, and they share a single common overflow trunk. The possible 
states of this system form the partially ordered system shown in Fig. 14. 
Alternative ways of putting up particular calls are marked with “ch”, 
for “‘choice’’. 

After inspecting the system and its state diagram, intuition tells us
that, as a first guess, calls should use the primary trunks whenever they
can, so as to leave the overflow open as much as possible. Let us, on this
basis, formulate some preferences for certain routes.

Fig. 13 — Second asymmetric grading. (Line groups 1 and 2; trunks 1, 2, and 3.)

Clearly, in state 0 a call from group 1 should go on trunk 1, so in state 
0 we prefer state (1-1) to (1-3); similarly we prefer (2-2) to (2-3). The 
same principle should apply if certain calls are already in progress. 
Thus, in state (2-2) we prefer (1-1) (2-2) over (1-3) (2-2), and in state 
(1-1) we prefer (1-1) (2-2) to (1-1) (2-3). 

If taken seriously and followed, the preferences listed above define a
policy for putting in calls. We shall show that this policy differs from the
optimal policy only in that the latter may reject some calls, while the
former accepts all unblocked calls. To do this write xPy if state x is
preferred to state y. Thus, the relation P is defined by the conditions

(1-1) P (1-3)
(2-2) P (2-3)
(1-1) (2-2) P (1-3) (2-2)
(1-1) (2-2) P (1-1) (2-3).

Fig. 14 — States of the grading of Fig. 13. (CH marks a choice between
alternative ways of putting up a call; the diagram also marks the states
in which one or two calls are blocked.)


We let
E_x(n) = expected number of successful call attempts in n events, if the
system starts in state x and an optimal policy is used.

It must be explained here that by “use of an optimal policy” over n
steps we mean simply that we use a policy which will maximize the
average number of successful attempts among those n events; the policies
that achieve this may, for all we know at this point, be different for
different n.

A slight departure from the probabilistic model of Section VI is
necessary here: we assume that an idle line generates calls to the trunk
destination at a rate λ > 0, instead of assuming that an idle inlet-outlet
pair generates calls at λ. Also, we let α_x be the number of idle lines in
x, rather than that of idle inlet-outlet pairs, and s(x) that of
idle lines that are not blocked.


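Theorem 9: If xPy, then

$$E_x(n) \ge E_y(n), \qquad n = 1,2,3,\ldots.$$

Proof:

$$E_x(1) = \frac{\lambda s(x)}{|x| + \lambda\alpha_x}$$

and xPy implies s(x) ≥ s(y), so the theorem is true for n = 1. Assume
that the theorem holds for some n ≥ 1. There are four cases, corresponding
to the four conditions defining P. We shall give the argument for
the case where

x = (1-1) (2-2),
y = (1-3) (2-2),

and (as we know) xPy; the others are similar.
Now apparently

$$E_{(1\text{-}1)(2\text{-}2)}(n+1) = \frac{1}{2 + 3\lambda} \{E_{(1\text{-}1)}(n) + E_{(2\text{-}2)}(n)\} + \frac{\lambda}{2 + 3\lambda} \max\{E_{(1\text{-}1)(2\text{-}2)}(n),\; 1 + E_{(1\text{-}1)(1\text{-}3)(2\text{-}2)}(n)\} + \frac{2\lambda}{2 + 3\lambda} \max\{E_{(1\text{-}1)(2\text{-}2)}(n),\; 1 + E_{(1\text{-}1)(2\text{-}2)(2\text{-}3)}(n)\}$$

and

$$E_{(1\text{-}3)(2\text{-}2)}(n+1) = \frac{1}{2 + 3\lambda} \{E_{(2\text{-}2)}(n) + E_{(1\text{-}3)}(n)\} + \frac{\lambda}{2 + 3\lambda} \max\{E_{(1\text{-}3)(2\text{-}2)}(n),\; 1 + E_{(1\text{-}1)(1\text{-}3)(2\text{-}2)}(n)\} + \frac{2\lambda}{2 + 3\lambda}\, E_{(1\text{-}3)(2\text{-}2)}(n).$$

By the induction hypothesis,

$$E_{(1\text{-}1)}(n) \ge E_{(1\text{-}3)}(n), \qquad E_{(1\text{-}1)(2\text{-}2)}(n) \ge E_{(1\text{-}3)(2\text{-}2)}(n);$$

hence,

$$E_x(n+1) \ge E_y(n+1)$$

for the given x and y.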

The point is that each event that can occur leads to a “worse” state
in y than it does in x. Thus, the hangup of the group 1 call leads both
to the state (2-2), a standoff; hangup of the group 2 call takes x into (1-1)
and y into (1-3), and (1-1)P(1-3); one of the possible new calls leads
both x and y to the state (1-1) (1-3) (2-2), another standoff; the other
two possible new calls are blocked in y but not in x, so that by the
induction hypothesis, rejecting one of them and staying in x is at least
as good as having one of these blocked calls make an attempt in y.

We conclude from Theorem 9 that in an optimal policy the calls 
which are not rejected are put on the primary trunks if these are avail- 
able, and on the overflow only if the primary trunk appropriate to the 
call is already busy. This result is entirely in agreement with our original 
intuition. 
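
The induction in Theorem 9 amounts to a finite-horizon dynamic program, which can be run directly for the grading of Fig. 13. The sketch below assumes the set-of-(group, trunk) state encoding, takes group 1 to have two lines and group 2 three (as the coefficients λ and 2λ in the proof suggest), and lets an attempt be rejected whenever staying put is at least as good. The names are illustrative only.

```python
from fractions import Fraction
from functools import lru_cache

ACCESS = {1: (1, 3), 2: (2, 3)}   # primary trunk, then the shared overflow trunk 3
LINES = {1: 2, 2: 3}              # assumed line counts per group

def idle(state, g):
    return LINES[g] - sum(1 for h, _ in state if h == g)

@lru_cache(maxsize=None)
def E_opt(state, n, lam):
    """Maximal expected number of successful attempts in n events from `state`."""
    if n == 0:
        return Fraction(0)
    busy = {t for _, t in state}
    total = len(state) + lam * sum(idle(state, g) for g in LINES)
    acc = Fraction(0)
    for call in state:                                   # hangups, unit rate each
        acc += E_opt(state - {call}, n - 1, lam)
    stay = E_opt(state, n - 1, lam)                      # reject, or call is blocked
    for g in LINES:
        k = idle(state, g)
        if k == 0:
            continue
        options = [1 + E_opt(state | {(g, t)}, n - 1, lam)
                   for t in ACCESS[g] if t not in busy]
        acc += lam * k * max([stay] + options)           # best of reject / each route
    return acc / total

fs = frozenset
P_PAIRS = [
    (fs({(1, 1)}), fs({(1, 3)})),
    (fs({(2, 2)}), fs({(2, 3)})),
    (fs({(1, 1), (2, 2)}), fs({(1, 3), (2, 2)})),
    (fs({(1, 1), (2, 2)}), fs({(1, 1), (2, 3)})),
]
lam = Fraction(1, 2)
holds = all(E_opt(x, n, lam) >= E_opt(y, n, lam)
            for n in range(1, 7) for x, y in P_PAIRS)
```

With these assumptions the four preferences defining P are respected at every horizon tried, which is what Theorem 9 asserts for all n.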

Another example of the same kind is shown in Figs. 15 and 16: the
intuitive preferences shown in Fig. 16 by ‘P’ are optimal to within
rejection of unblocked calls.

Fig. 15 — Third asymmetric grading. (Lines 1, 2, and 3; trunks 1, 2, and 3.)


We now formalize the principles behind the intuitions that led to 
Theorem 9. 

Let P be a relation on S, i.e., a subset of S². We may as well put our
cards on the table and indicate that P is to be interpreted as a relation
of “preference”, with xPy meaning “x is preferred to y”. If μ is a function,
and X,Y are sets, the (customary) notation

$$\mu: X \leftrightarrow Y$$

means that μ takes X into Y in a one-one manner, while

$$\mu: X \to Y$$

means that the μ-image of X is contained in Y.
We say that P has the strong monotone property if xPy implies

(i) |x| = |y|
(ii) ∃μ: B_x ↔ B_y such that z ∈ B_x implies z P μz
(iii) ∃ν: A_y → A_x such that ν(A_cy) ⊆ A_cx for c ∈ y, and
      z ∈ A_y implies νz P z.    (14)

Let us denote by F_x the set of all calls which are free or idle in x, i.e.

F_x = {c: c is idle in x} = {γ(y − x): y ∈ A_x}
    = {c: c = {(u,v)} ⊆ I × Ω with u,v both idle in x}.

We say that a relation P on S has the weak monotone property if xPy
implies

(i) |x| = |y|
(ii) ∃μ: B_x ↔ B_y and z ∈ B_x implies z P μz
(iii) ∃ν: F_y → F_x and c ∈ F_y, z ∈ A_cy
      imply ∃w ∈ A_(νc)x with wPz.    (15)

To get the weak monotone property from the strong, define ν on
F_y by

$$\nu\gamma(z - y) = \gamma(\nu z - x), \qquad z \in A_y;$$

then z ∈ A_cy implies νz ∈ A_cx, and

$$\nu c = \gamma(\nu z - x);$$

thus,

$$\nu z \in A_{(\nu c)x} \quad \text{and} \quad \nu z\, P\, z.$$


Keeping in mind the interpretation that ‘xPy’ means that x is in
some sense better than y, we see that: condition (i) restricts P to hold
only between states of the same norm or dimension, because we are
interested only in choosing between states with the same number of
calls in progress; condition (ii) says roughly that to every hangup leading
out of state y there corresponds a hangup in x leading to a state
which is at least as “good” (as the one reached by the hangup in y);
condition (iii) says that for any way of completing a new call c in y
there is a way of completing the same call c in x which leads to at least
as “good” a state (as the one reached by completing that call in y).

It is easily seen that P has one of the monotone properties if and only
if xPy implies that P holds between “corresponding respective” neighbors
of x and y. Thus, the monotone properties are similar to the property
of a policy φ that it preserve B. The principal differences are that
here no policy is at issue, and that the meaning of “corresponding neighbor”
is weaker than in the definition of preservation. The relationships
to the relation B are further clarified in the following remarks.

Fig. 16 — States of the grading of Fig. 15.

If P has the weak monotone property, then xPy implies $s(x) \ge s(y)$. If P has the strong monotone property, then xPy implies that every $c \in x$ blocked in x is blocked in y. Further, since we are primarily interested in comparing equivalent states (i.e., x and y such that x ~ y), it is natural to restrict attention to preference relations P which are subsets of ~, $P \subseteq\ \sim$. It can then be verified that if P has either monotone property, and holds only between equivalent states ($P \subseteq\ \sim$), then $P \subseteq B$.

A “preference” relation should impose at least a partial ordering among the objects for which it is defined, and so it is by nature transitive. The question then arises whether the relations P that have the (strong or weak) monotone property are reflexive and transitive. It is obvious that if P has the monotone property then so does $I \cup P$, where I is the identity relation. Now, as is known, every relation P can be extended uniquely to its transitive closure $\bar{P}$, the smallest transitive relation containing P. We shall now prove:


Theorem 10: If $P \subseteq S^2$ has the weak monotone property, then so does its transitive closure $\bar{P}$.


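Proof: Clearly $\bar{P} = P \cup P^2 \cup P^3 \cup \cdots$, where the powers represent relative, not Cartesian, products. It is obvious that $x\bar{P}y$ implies $|x| = |y|$, so $\bar{P}$ has property (i) of (14). Next let $x\bar{P}y$, so that there exist $z_1, z_2, \cdots, z_n \in S$ such that $x = z_1$, $z_n = y$ and

$$z_i P z_{i+1}, \quad i = 1, \cdots, n-1.$$

Thus, there exist maps $\mu_1, \mu_2, \cdots, \mu_{n-1}$ with $\mu_i: B_{z_i} \leftrightarrow B_{z_{i+1}}$ such that $z \in B_{z_i}$ implies

$$z P \mu_i z.$$

Hence, $z \in B_x$ implies

$$z P \mu_1 z,$$
$$\mu_1 z P \mu_2\mu_1 z,$$
$$\cdots$$
$$\mu_{n-2}\mu_{n-3}\cdots\mu_1 z \; P \; \mu_{n-1}\mu_{n-2}\cdots\mu_1 z,$$

i.e.,

$$z \bar{P} \Big(\prod_{i=1}^{n-1} \mu_i\Big) z.$$

Thus,

$$\mu = \prod_{i=1}^{n-1} \mu_i$$

has the property that $\mu: B_x \leftrightarrow B_y$ and $z \in B_x$ implies $z\bar{P}\mu z$. Hence, $\bar{P}$ has property (ii). Finally, there exist maps $\nu_1, \cdots, \nu_{n-1}$ with $\nu_i: F_{z_{i+1}} \to F_{z_i}$ such that $c \in F_{z_{i+1}}$, $z \in A_{cz_{i+1}}$ implies $\exists w \in A_{(\nu_i c)z_i}$ with $wPz$. Let

$$\nu = \prod_{i=1}^{n-1} \nu_i.$$

Hence, for each $c_n = c \in F_y$, $w_n \in A_{cy}$ there exist $w_1, \cdots, w_{n-1} \in S$ and $c_1, \cdots, c_{n-1}$ such that

$$c_i = \nu_i c_{i+1}, \quad w_i \in A_{c_i z_i}, \quad w_i P w_{i+1}, \quad i = 1, \cdots, n-1.$$

It is apparent that $c_1 = \nu c$, $w_1 \in A_{(\nu c)x}$ and $w_1 \bar{P} w_n$, so that $\bar{P}$ has property (iii).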
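The transitive closure used in Theorem 10 can be computed mechanically for a finite relation. The following is a minimal sketch, assuming the relation is given as a Python set of ordered pairs; the function name `transitive_closure` and the sample states are hypothetical and do not come from the paper. It uses Warshall's algorithm rather than the union of relative powers written in the proof, although both yield the same relation.

```python
def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`.

    `pairs` is an iterable of (a, b) tuples, read as "a P b".
    """
    closure = set(pairs)
    elements = {a for a, _ in closure} | {b for _, b in closure}
    # Warshall's algorithm: allow each element k in turn as an intermediate point.
    for k in elements:
        preds = [a for a in elements if (a, k) in closure]
        succs = [b for b in elements if (k, b) in closure]
        for a in preds:
            for b in succs:
                closure.add((a, b))
    return closure


# Hypothetical chain x P z1, z1 P z2, z2 P y, as in the proof of Theorem 10.
P = {("x", "z1"), ("z1", "z2"), ("z2", "y")}
P_bar = transitive_closure(P)
```

In this toy chain the closure relates x to y even though P itself does not, which is the situation the proof handles by composing the maps attached to each link.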

The following result is now immediate:

Theorem 11: If P has the weak monotone property, and I is the identity relation, then

$$\overline{(I \cup P)}$$

is a partial ordering relation with the weak monotone property.

Any relation with the weak monotone property can be extended to be a partial ordering P that has the weak monotone property. Since ~ is an equivalence relation between states, and P is a partial ordering, it follows that $P\ \cap \sim$ is also a partial ordering.


Theorem 12 (Second Isotony Theorem): If $P \subseteq S^2$ has the weak monotone property, then

$$xPy \text{ implies } E_x(n) \ge E_y(n), \quad n = 1, 2, \cdots.$$

Proof: Property (15)(iii) implies that $s(x) \ge s(y)$ whenever xPy. Now

$$E_x(1) = \frac{\lambda s(x)}{|x| + \lambda\alpha_x}.$$

Since it is assumed that $\alpha_x = \alpha_{|x|}$, we have, by (15)(i),

$$xPy \text{ implies } E_x(1) \ge E_y(1).$$


As an hypothesis of induction assume that xPy implies $E_x(n) \ge E_y(n)$. We have

$$E_x(n+1) = \frac{\lambda}{|x| + \lambda\alpha_x} \sum_{c \in x} \max\Big\{E_x(n),\; g(c,x) + \max_{z \in A_{cx}} E_z(n)\Big\} + \frac{1}{|x| + \lambda\alpha_x} \sum_{h \in x} E_{x-h}(n),$$

and a similar expression for $E_y(n+1)$. If now xPy, then $|x| = |y|$ by (15)(i), and also

$$E_{x-h}(n) \ge E_{\mu(x-h)}(n)$$

by (15)(ii) and the hypothesis of induction. Similarly,
a rhe 


For c not blocked in y, and $z \in A_{cy}$, we see that there exists $w \in A_{(\nu c)x}$ with $wPz$, by (15)(iii). By the hypothesis of induction, this implies that

$$E_w(n) \ge E_z(n).$$

Since $z \in A_{cy}$ was arbitrary, we find

$$g(\nu c, x) + \max_{w \in A_{(\nu c)x}} E_w(n) \ge g(c,y) + \max_{z \in A_{cy}} E_z(n).$$

It follows that xPy implies $E_x(n+1) \ge E_y(n+1)$.


XVI. RELEVANCE OF THE ISOTONY THEOREMS TO OPTIMAL POLICIES 


Let $c \in x$ be a call that is not blocked in state x, so that $A_{cx} \ne \emptyset$. If the hypotheses of one of the isotony theorems obtain, then it may be possible to single out some of the states $y \in A_{cx}$ as providing ways of completing c in x which are at least as good as certain others. Specifically, the sort of comparison we can make is this: If $y, z \in A_{cx}$ and yBz or yPz, then y is at least as good as z in the sense that

$$E_y(n) \ge E_z(n), \quad n = 1, 2, \cdots.$$

Suppose now that there is at least one $y \in A_{cx}$ such that yBz for all $z \in A_{cx}$. It then follows that such a y is always at least as good a choice as any other state of $A_{cx}$, in the above sense. A similar result follows if there is a $y \in A_{cx}$ with yPz for all $z \in A_{cx}$. In such situations a policy that routes c so as to take the system from x to y can differ (so far as x and c are concerned) from an optimal policy only in the respect that an optimal policy might reject c in x. This is the sense in which the isotony theorems can provide the part of the solution of the routing problem which has to do with choosing routes for accepted calls. Two theorems to this effect appear in Section XVIII after an aside about equivalence of decisions and nonuniqueness of optimal policies.


XVII. EQUIVALENCE OF DECISIONS AND NONUNIQUENESS OF OPTIMAL 
POLICIES 


It is natural to expect that there are often several optimal policies, in the sense that, for some c and x with $c \in x$ and $A_{cx} \ne \emptyset$, there are two choices of a route for c in x which are in some sense distinct routes and yet are both equally “good”. For example, in most traffic models for a graded or progressive multiple it often does not make any difference which trunk in a group is used for a call: the possible states resulting from use of one of the trunks in the group are all distinct, yet all are equally “good”, being “equivalent” under permutations of trunks within the group. It is intuitively clear that such a nonuniqueness of optimal policies is due in large part to symmetries in the network under study, or more generally, to the presence of various equivalences of states (and hence of routing decisions) under certain groups of permutations of terminals.† Since some of these equivalences appear in a later proof, we digress a little for an account of them, first heuristic, then formal.

† It should be noted that the word ‘group’ is used in this paragraph in two technical senses, the first from traffic theory, referring to a set of trunks, the second from the theory of groups.

As we have seen, one of the principal tools in the description of optimal policies is a combinatorial partial ordering, such as B or P, which implies an ordering in terms of performance. The discussion to follow is based on a general partial ordering R, which the reader can assume is contained in

$$\bigcup_{c \in x} A_{cx}^2$$

and which he can interpret as B or P, if he wishes.‡

‡ This use of ‘R’ is peculiar to this section, and should not be confused with R as a routing matrix.

Let then R be a partial ordering of S and let Y be a subset of S. Cued by the remarks of Section XVI, we want to use R to compare states; in particular we wish to talk about elements $y \in Y$ such that yRz for all $z \in Y$. It would be satisfyingly simple if at this point we could introduce the notation

$$\sup_R Y$$

for that element of Y which bears R to every other element of Y. Unfortunately this is usually impossible, because there may be several or many such “suprema” of Y. In this situation the usual mathematical trick to use is to pass to suitable equivalence classes. Use of this procedure is further justified by the fortunate fact that, in the case of several interesting choices of R and Y, there are several senses in which these maximal elements turn out to be equivalent. What is more, there is a natural equivalence based only on R, such that $\sup_R Y$ can, if it exists, be defined in the “quotient” set of the equivalence, i.e., in the image of the semilattice homomorphism that takes each state into the equivalence class to which it belongs.

If R = P and P has the monotone property, then all the P-suprema of $A_{cx}$ are equivalent in this very important sense: If $y_1, \cdots, y_m$ is an enumeration of all the $y \in A_{cx}$ that are best in the sense that yPz for all $z \in A_{cx}$, then

$$y_i P y_j, \quad 1 \le i, j \le m$$

and the second isotony theorem gives

$$E_{y_i}(n) = E_{y_j}(n), \quad n = 1, 2, \cdots, \tag{16}$$

so that as far as performance is concerned, $y_1, \cdots, y_m$ are all “equivalent”. In many cases, this fact is based on an underlying equivalence of a combinatorial nature, much stronger than (16): e.g., in a trunk group the different states attainable by different choices of a trunk for a call are equivalent in the sense that given any two there is a renaming or permutation of the trunks which carries one into the other.

The isotony theorems provide ways of translating a combinatorial comparison of states such as

$$xBy, \text{ or } xPy$$

into a numerical comparison of the relative merit or value of starting in each state, x or y. In such a setting it is natural to call x and y “equivalent” if the comparison holds both ways, i.e., if, when interpreting ‘xBy’ as a (rather strong) precise form of ‘x is better than y’, we have both

$$xBy \text{ and } yBx.$$

Lemma 6: Given two states y, z there exists at most one pair c, x such that both $y, z \in A_{cx}$.

Proof: If $y, z \in A_{cx}$ then $x = y \cap z$ in the sense of the semilattice of states. Thus, x is unique. If now

$$y, z \in A_{cx} \cap A_{c'x},$$

then $c = \gamma(y - x)$, $c' = \gamma(y - x)$, so $c = c'$.

The foregoing observations are the motivation for the ensuing development. With the partial ordering R we associate the natural equivalence relation $\equiv_R$ defined by

$$z \equiv_R y \text{ if and only if } zRy \text{ and } yRz \text{ and } \exists c, x: y, z \in A_{cx}.$$

The subscript R will usually be dropped as long as it is contextually clear what R is being used to define $\equiv$. Along with $\equiv$ we introduce the semilattice homomorphism

$$\tau(\cdot): S \to \{\text{equivalence classes of } \equiv\} = S/\equiv$$

defined by $\tau(x) = \{z : z \equiv x\}$.

The image $\tau(S)$, i.e., the “quotient” set $S/\equiv$, is partially ordered by the relation $\tau R$ defined by

$$\tau(x)\, \tau R\, \tau(y) \text{ if and only if } \exists u, v: u \in \tau(x) \text{ and } v \in \tau(y) \text{ and } uRv.$$

This is the natural homomorphic “contraction” of R to $S/\equiv$. It can be verified that if $\tau(x)\, \tau R\, \tau(y)$ and $\tau(y)\, \tau R\, \tau(x)$, then $\tau(x) = \tau(y)$ strictly.

If now Y contained in S is such that there exists a $y \in Y$ with yRz for every $z \in Y$, we use the notation

$$\sup_R Y \tag{17}$$

for $\tau(y)$. It is clear that in the “quotient” space, an element maximal with respect to $\tau R$ is unique if it exists at all. Strictly speaking the notation

$$\sup_{\tau R} \tau Y$$

would be better, since it indicates that the supremum operation only makes sense after the homomorphism. However, (17) will be used, with the reminder that it is a set, not a state, and the convention that use of (17) implies the assumed existence of maximal elements.
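The convention that (17) denotes a set of mutually equivalent maximal states can be illustrated on a finite example. The sketch below is only an illustration under stated assumptions: the relation R is represented as a set of ordered pairs, Y plays the role of a single set $A_{cx}$, and the function name `sup_class` and the trunk labels are hypothetical.

```python
def sup_class(Y, R):
    """Return the frozenset of all y in Y with y R z for every z in Y.

    Returns None when no such maximal element exists. R is a set of
    (y, z) pairs; the relation is treated as reflexive.
    """
    best = frozenset(
        y for y in Y
        if all(y == z or (y, z) in R for z in Y)
    )
    return best if best else None


# Hypothetical trunk group: three interchangeable choices and one worse state.
top = ["t1", "t2", "t3"]
Y = top + ["w"]
R = {(a, b) for a in top for b in top} | {(a, "w") for a in top}
best = sup_class(Y, R)
```

Here `best` contains all three interchangeable states, so the "supremum" is a class rather than a single state, which is why the text passes to the quotient set before speaking of uniqueness.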

With the notation (17) we can prove the following natural relationship between the strong monotone property and the notion of preservation of B.

Theorem 13: Let

$$\varphi(e,x) \begin{cases} \in \sup_B A_{cx} & \text{for } e = c \\ = x - h & \text{for } e = h \end{cases}$$

and suppose that $\varphi$ preserves B. Then B has the strong monotone property.

Proof: xBy implies x ~ y and hence $|x| = |y|$, so B has property (14)(i). If xBy, define for $z \in B_x$

$$\mu z = \varphi(\gamma(x - z), y).$$

Then, since $\varphi$ preserves B,

$$\varphi(\gamma(x - z), x) \, B \, \varphi(\gamma(x - z), y),$$
$$z B \mu z,$$

and B has property (14)(ii). With xBy still, let

$$\nu: A_y \to A_x$$

be given by $\nu z = \varphi(c,x)$ for $z \in A_{cy}$. Then, since $\varphi(c,y)Bw$ for $w \in A_{cy}$,

$$\nu z = \varphi(c,x) \, B \, \varphi(c,y) \, B \, z,$$

so that B has property (14)(iii).


XVIII. OPTIMAL ROUTING THEOREMS 


This final section contains precise statements showing just how the combinatorial properties introduced in Sections XIV and XV answer the question: “Which route should an accepted call use?”

Two policies $\varphi$ and $\psi$ will be termed equivalent with respect to rejections, written $\varphi \sim \psi$, if they both reject the same calls in the same states, i.e., if $\varphi(c,x) = x$ when and only when $\psi(c,x) = x$ for $c \in x$.


Theorem 14: If $\varphi$ preserves B, and if $c \in x$ implies

$$\varphi(c,x) \in \sup_B A_{cx}$$

whenever $\varphi(c,x) \ne x$, then

$$E_x(n,\varphi) \ge E_x(n,\psi), \quad n = 1, 2, \cdots$$

for any $\psi \sim \varphi$.

Proof: $E_x(1,\varphi) = E_x(1,\psi)$ by direct calculation. Assume as a hypothesis of induction that $E_x(n,\varphi) \ge E_x(n,\psi)$ for $x \in S$. We have

$$E_x(n+1,\varphi) = \frac{\lambda}{|x| + \lambda\alpha_x} \sum_{\substack{c \in x \\ \varphi(c,x) \ne x}} \{1 + E_{\varphi(c,x)}(n,\varphi)\} + \frac{\lambda}{|x| + \lambda\alpha_x} \sum_{\substack{c \in x \\ \varphi(c,x) = x}} E_x(n,\varphi) + \frac{1}{|x| + \lambda\alpha_x} \sum_{h \in x} E_{x-h}(n,\varphi),$$

and a similar expression for $E_x(n+1,\psi)$. If now $\varphi(c,x) \ne x$, then $\varphi(c,x)By$ for every $y \in A_{cx}$; in particular, $\psi(c,x) \ne x$ because $\varphi \sim \psi$, and so $\psi(c,x) \in A_{cx}$, whence

$$\varphi(c,x) \, B \, \psi(c,x).$$

The first isotony theorem and the induction hypothesis now give

$$E_{\varphi(c,x)}(n,\varphi) \ge E_{\psi(c,x)}(n,\varphi) \ge E_{\psi(c,x)}(n,\psi).$$

It follows that

$$E_x(n+1,\varphi) \ge E_x(n+1,\psi).$$

Corollary: If $\varphi$ preserves B, and

$$\varphi(c,x) \in \sup_B A_{cx}$$

for $c \in x$ not blocked in x, then $\varphi$ is optimal within the class of policies that reject no unblocked calls.
Theorem 15: If P has the weak monotone property, and

$$\sup_P A_{cx}$$

exists for each $c \in x$ not blocked in x, then there exists an optimal policy R such that $c \in x$, $y \in A_{cx}$ imply either x is R-transient or else

$$r_{xy} = 0 \text{ unless } y \in \sup_P A_{cx}.$$

Proof: Let p be the scalar such that 


= A(R). 


aN 
me 
| | 


pl = max lim — ty Ave 


REG now 1 j= 
We first use an argument of R. Bellman to show that the vector sequence

$$E(n) - n\rho\mathbf{1} = g(n)$$

is bounded in n. By Lemma 3, there is a vector $g^*$ which satisfies

$$g^* + \rho\mathbf{1} = \max_{R \in C} \{v(R) + A(R)g^*\}.$$

Choose K > 0 so that

$$g^* - K\mathbf{1} \le g(1) \le g^* + K\mathbf{1}.$$

Assume, as an induction hypothesis, that

$$g^* - K\mathbf{1} \le g(n) \le g^* + K\mathbf{1}.$$

We have

$$g(n+1) = -\rho\mathbf{1} + \max_{R \in C} \{v(R) + A(R)g(n)\}.$$

Hence,

$$-\rho\mathbf{1} - K\mathbf{1} + \max_{R \in C} \{v(R) + A(R)g^*\} \le g(n+1) \le -\rho\mathbf{1} + K\mathbf{1} + \max_{R \in C} \{v(R) + A(R)g^*\},$$

$$g^* - K\mathbf{1} \le g(n+1) \le g^* + K\mathbf{1}.$$

Let now

$$\bar{g} = \limsup_{n \to \infty} g(n),$$

taken componentwise. Let $R_n$ achieve the maximum in

$$\max_{R \in C} \{v(R) + A(R)g(n)\}.$$

Given $\epsilon > 0$, there exists $n_0$ such that $n > n_0$ implies

$$g_x(n) \le \bar{g}_x + \epsilon$$

for all $x \in S$. Thus,

$$v(R_n) + A(R_n)g(n) = v(R_n) + A(R_n)\bar{g} + A(R_n)[g(n) - \bar{g}]$$
$$\le v(R_n) + A(R_n)\bar{g} + \epsilon\mathbf{1}$$
$$\le \max_{R \in C} \{v(R) + A(R)\bar{g}\} + \epsilon\mathbf{1}.$$

Hence, since $\epsilon > 0$ was arbitrary,

$$\bar{g} + \rho\mathbf{1} \le \max_{R \in C} \{v(R) + A(R)\bar{g}\}. \tag{18}$$


Let R* achieve the maximum on the right above. By Lemma 3, R* is 
optimal. Let F be the set of transient states relative to R*. The argu- 
ment used in Lemma 4 shows that equality must obtain in (18) on 


SH, ie, . 
Jz + p= max v(B) =f ye co) P) xEs = F. 
REC yES-F 


This is equivalent to 


r 
2tp= epee ne mange b+ max , 
J 2 oleh nes Pies 
c not blocked in z 
\B29z 
Tena | eT oe pets 


Now the second isotony theorem implies that if xPy, then

$$E_x(n) \ge E_y(n), \quad n \ge 1$$
$$g_x(n) \ge g_y(n), \quad n \ge 1$$
$$\bar{g}_x \ge \bar{g}_y.$$

Thus, if $c \in x$ is not blocked in x,

$$\max_{z \in A_{cx}} \bar{g}_z$$

is achieved by each and any $y \in \sup_P A_{cx}$.
Let R be any routing matrix such that for y € Acz 
0 if ye SFP, 


Ley = 


1 onlyif 1+ 9,29. and y € sup Ag. 
P 


Then R achieves the maximum in (18), and so is optimal; it is clear that 
it also has the property claimed in the theorem. 
APPENDIX


Expected Number of Events to the First Blocked Call 


The purpose of this appendix is to demonstrate that if the index of 
performance is changed to one which attaches greater importance (than 
does Pr{bl}) to blocked calls occurring soon after the system is started, 
then no unblocked call should ever be rejected. This result can be ob- 
tained for various indices of performance; we obtain it for the expected 
number of events occurring until the first blocked call. This choice of 
index of performance has a natural heuristic justification: it corresponds 
to trying to put off the undesirable event (a blocked call) as long as possible. 
(Time is being measured here in discrete units, by counting events.) 

As before we use $\varphi$ and $\psi$ for policies, but here we limit them to rejection policies, or policies for the acceptance or rejection of unblocked calls. We may think of $\varphi$ as a binary function of c, x with $c \in x$ and c not blocked in x, and interpret $\varphi(c,x) = 1$ as acceptance, and $\varphi(c,x) = 0$ as rejection. A general routing policy, such as described by a fixed routing matrix R, will be said to be within $\varphi$ if it accepts and/or rejects the same calls in the same states.

We first introduce the quantities

$E_x(\varphi)$ = Expected number of events until the first blocked or rejected call under a routing policy optimal within the rejection policy $\varphi$, starting in x.†

† The word ‘optimal’ here refers, naturally, to the fact that the (not necessarily stationary) policy followed makes the expected number of events to the first call (rather than Pr{bl}, or some other index) a maximum.

These satisfy the equations

$$E_x(\varphi) = \frac{|x| + \lambda s(x)}{|x| + \lambda\alpha_x} + \frac{\lambda}{|x| + \lambda\alpha_x} \sum_{\substack{c \text{ not blocked in } x \\ \varphi(c,x) = 1}} \max_{y \in A_{cx}} E_y(\varphi) + \frac{1}{|x| + \lambda\alpha_x} \sum_{y \in B_x} E_y(\varphi).$$

Our object will be to pick the best rejection policy, i.e., to choose $\varphi$ so as to achieve

$$\max_{\varphi} E_x(\varphi).$$

We next define, for each fixed routing matrix R

$E_x(R)$ = Expected number of events until the first blocked or rejected call, starting in x and using the policy R.

For fixed $\varphi$, let $R^* = R^*(\varphi) = (r_{xy})$ be a routing matrix with the property

$$r_{xy} = \begin{cases} 1 & \text{if } \exists c \in x \text{ such that } y \in A_{cx},\ \varphi(c,x) = 1, \text{ and } E_y(\varphi) = \max_{z \in A_{cx}} E_z(\varphi), \\ 0 & \text{otherwise.} \end{cases}$$

It is clear that at least one such $R^*$ exists, that it is within $\varphi$, and that it defines a stationary policy for which

$$E(R^*) = E(\varphi).$$
We now partially order all rejection policies thus:

$$\varphi \ge \psi \text{ if and only if } \varphi(c,x) \ge \psi(c,x) \text{ for } c \in x \text{ not blocked in } x.$$

Let $\Phi$ be the set of rejection policies. The principal result is that $E(\cdot)$ is isotone on the partial ordering $\ge$ of $\Phi$, expressed in


Theorem 16: $\varphi \ge \psi$ implies $E(\varphi) \ge E(\psi)$.
Proof: For $|S|$-vectors v define the transformations $T_\varphi$, $\varphi \in \Phi$ by

$$(T_\varphi v)_x = \frac{\lambda}{|x| + \lambda\alpha_x} \sum_{\substack{c \text{ not blocked in } x \\ \varphi(c,x) = 1}} \max_{y \in A_{cx}} v_y + \frac{1}{|x| + \lambda\alpha_x} \sum_{y \in B_x} v_y.$$

With

$$b_x = \frac{|x| + \lambda s(x)}{|x| + \lambda\alpha_x},$$

the equation for $E(\varphi)$ becomes

$$E(\varphi) = b + T_\varphi E(\varphi).$$

It is evident that if $v \ge 0$ and $\varphi \ge \psi$, then

$$T_\varphi v \ge T_\psi v.$$

Furthermore, each $T_\varphi$, $\varphi \in \Phi$, is a monotone transformation in that

$$v \ge w \text{ implies } T_\varphi v \ge T_\varphi w.$$

Hence, $v \ge w \ge 0$, $\varphi \ge \psi$ imply

$$b + T_\varphi v \ge b + T_\psi w.$$

For $\varphi \ge \psi$, then, consider the rectangular parallelepiped

$$\mathcal{P} = \{v : 0 \le v \le E(\varphi)\}.$$

For $v \in \mathcal{P}$ we have

$$E(\varphi) = b + T_\varphi E(\varphi) \ge b + T_\psi v,$$

so that the map $v \mapsto b + T_\psi v$ takes $\mathcal{P}$ into $\mathcal{P}$. It is obvious that $\mathcal{P}$ is closed and that $T_\psi$ is continuous. Hence, by Brouwer's fixed point theorem there is a $v \in \mathcal{P}$ satisfying

$$v = b + T_\psi v.$$

We next show that v is actually the unique solution of this equation, so that $v = E(\psi) \le E(\varphi)$. Introduce the norm $\|v\| = \max_{x \in S} |v_x|$. The case in which the network under study is nonblocking and $\psi$ rejects no calls is trivial. Assume then that there exists a state x and a call $c \in x$ such that either c is blocked in x or c is not blocked in x and is rejected by $\psi$. This implies that the “matrix” part of $T_\psi$ is strictly substochastic, and hence that for some n

$$\|T_\psi^n\| < 1.$$

Thus, $v = E(\psi)$.

It is an immediate consequence of Theorem 16 that if $\varphi^*(c,x) = 1$ for $c \in x$ not blocked in x, then

$$E(\varphi^*) = \max_{\varphi \in \Phi} E(\varphi).$$
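The uniqueness step in the proof of Theorem 16 rests on the fact that an equation of the form v = b + Tv has exactly one solution when some power of T has norm less than one. The following is a minimal sketch of that idea only, under a simplifying assumption not made in the paper: the maximization inside the transformation is replaced by a fixed, strictly substochastic matrix M, so the equation is linear. The names `solve_fixed_point`, `b`, and `M`, and the numerical values, are hypothetical.

```python
def solve_fixed_point(b, M, tol=1e-12, max_iter=10_000):
    """Solve v = b + M v by successive approximation, starting from v = 0.

    Assumes M is strictly substochastic (nonnegative rows summing to
    less than one), so the iteration is a contraction.
    """
    n = len(b)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [b[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        if max(abs(new_v[i] - v[i]) for i in range(n)) < tol:
            return new_v
        v = new_v
    raise RuntimeError("iteration did not converge")


# Hypothetical two-state example; each row of M sums to 0.5.
b = [1.0, 0.5]
M = [[0.0, 0.5],
     [0.5, 0.0]]
v = solve_fixed_point(b, M)
```

Solving the two equations by hand gives v = (5/3, 4/3), and the iteration reaches the same point from the zero vector, in keeping with the uniqueness argument above.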
Random Tropospheric Angle Errors in
Microwave Observations of the 


Early Bird Satellite 


By J. H. W. UNGER 
(Manuscript received June 16, 1966) 


A simplified analytical model of tropospheric random variations in angle 
measurements is described. This model is used to predict the minimum and 
maximum power density spectra between which the tropospheric random 
angle errors of observations on the Early Bird satellite are expected to lie. 

The apparent angular position of the Early Bird satellite was then meas- 
ured at microwave frequencies with the large horn-reflector antenna at the 
AT&T station near Andover, Maine. Random variations in the azimuth 
and elevation angles have been observed and recorded. The analysis of these 
records results in a description of the observed random angle variations by 
their power density spectra. 

A comparison of the predicted power density spectra from the model with the observed spectra is made. It is concluded that the observed random angle variations are indeed caused by random tropospheric refraction.

The feasibility of acquiring data on atmospheric propagation effects, particularly tropospheric angle errors, with the aid of geo-stationary satellites is therefore also demonstrated.


I. INTRODUCTION 


1.1 Objective of this Paper 


The performance of earth-based radar and optical systems is ulti- 
mately limited by temporal and spatial random variations in the refrac- 
tive index of the tropospheric propagation medium. It is the objective 
of this paper to present a method for predicting random tropospheric 
angle errors in such systems, and to compare a prediction with micro- 
wave observations made on the Early Bird satellite. 




1.2 Problem Approach 


The scintillation or twinkling of the stars which is experienced in observations through the earth’s troposphere is a familiar effect of the random variations in the refractive index of this propagation medium. Astronomers have known for a long time that the troposphere actually causes variations in at least four characteristics of the received star light, namely: (i) the intensity, (ii) the spectral distribution of the intensity, (iii) the shape of the telescopic diffraction image, and (iv) the apparent angular position of the star. Scientific studies¹,² of these effects seem to concentrate mainly on the intensity scintillations. The random variations in the other characteristics, especially in the apparent angular positions of stars, are treated in much less detail.

However, in those radar and optical systems which are used to measure the position (and its time derivatives) of both distant and near objects (such as aerospace vehicles) the random tropospheric angle variations assume great importance. For the analysis and synthesis of these systems, it is valuable to accumulate the knowledge on the random tropospheric errors in the form of a sufficiently general model.

Such an analytical model of tropospheric random errors in the position 
measurements and their time derivatives in radar and optical systems 
has been developed. Among other capabilities this model also permits 
the prediction of the random tropospheric angle variations for specified 
sets of tracking situations and system parameters. The prediction is 
made in terms of minimum and maximum power density spectra (PDS) 
between which the observed spectra are expected to lie. 

The choice of PDS for the characterization of the random errors is necessary because the relation between errors at two points in this system is usually a function of the error frequency (f). The resulting PDS further permit (i) subsequent studies of the effects of frequency dependent data processing operations (smoothing, calculation of derivatives, prediction, etc.), (ii) detailed comparison with errors from other sources, and (iii) application of the optimization methods described by H. W. Bode, C. E. Shannon, and S. Darlington. Values for the more familiar variance (σ²) or standard deviation (σ) of these random errors at the output of these processes may then be obtained with a straightforward integration of the output PDS (see Section 1.4 below).

Within the model, the predicted PDS of the random tropospheric angle errors are analytically calculated by operating with certain model functions on a model power density spectrum which is given in the range coordinate. This range model PDS represents the pooled data on tropospheric random refraction. It is based upon observations of random variations in the tropospheric refractive index, and in range and phase measurements mainly made at the National Bureau of Standards.⁴,⁵,⁶,⁷

The successful launch of the Communications Satellite Corporation’s Early Bird Satellite on April 6, 1965, and its subsequent stabilization in an almost perfect geo-stationary orbit, provided an opportunity to test the model. For this purpose, azimuth and elevation angle measurements on the microwave beacon of the Early Bird satellite were made with the large horn-reflector antenna at the American Telephone and Telegraph (AT&T) Station near Andover, Maine. Most of this ground equipment was previously described in detail.⁹,¹⁰,¹¹,¹²,¹³,¹⁴,¹⁵ A brief description of the Early Bird satellite may be found in Ref. 16.

The resulting angle error measurements are particularly valuable for 
comparison with the theoretical model since they are obtained under two 
unique conditions provided by a geo-stationary satellite as a target. 
First, the propagation path goes through the entire atmosphere so that 
the observed angle errors include possible effects of high altitude turbu- 
lence, which are impossible to obtain with Earth based targets. Second, 
the angular tracking rates are negligible relative to the effective wind in 
the troposphere with which the refractive index anomalies pass through 
the propagation path. 

Thus, the analysis of the angle error measurements and the prediction 
of the expected random angle errors for the geo-stationary satellite are 
considerably simplified compared to the analysis and prediction for the 
more frequent aerospace targets (aircraft, missiles, low satellites) which 
have large apparent angular velocities and motion disturbed by forces 
unknown in the necessary detail. The effects of temporal random varia- 
tions of the refractive index in the Earth’s atmosphere on the angle 
measurements, integrated along the line-of-sight between a ground an- 
tenna and a geo-stationary satellite should be observable in an almost 
pure form. 


1.3 Scope of this Paper 


In this paper, a simplified version of the model of random tropospheric 
errors is first described, which permits the calculation of the predicted 
minimum and maximum PDS of the tropospheric angle errors for track- 
ing tasks involving one almost stationary point target and a single ob- 
server (single-site radar). 

Next, the presented model is used to calculate the numerical values of 
the predicted PDS of random tropospheric angle errors for the specific 
tracking situation of the Early Bird observations. 

The methods and specific circumstances of data acquisition for one twenty minute period of observations of the Early Bird satellite from the AT&T ground station near Andover, Maine are then described. Another section is concerned with data processing and analysis; it includes the time series of observed azimuth and elevation angles, the calculations of their power density spectra and confidence limits, and estimates of the manual chart reading error and of the effects of thermal receiver noise.

Finally, a comparison is made between the PDS of random tropospheric angle errors predicted with the model and the PDS of the observed random angle variations.


1.4 Scaling of Power Density Spectra 


In this paper, the random variations of the observed azimuth and elevation angles will be described by their power density spectra (PDS). The numerical computation of the PDS from the time series of data is made by the indirect method described by Blackman and Tukey. However, the scaling of the PDS in this paper deviates from that of Blackman and Tukey by defining the variance $\sigma^2$ of the random error as

$$\sigma^2 = \int_0^\infty P\{f\}\, df. \tag{1}$$

Thus, the PDS $P\{f\}$ is valid only for positive frequencies, $f \ge 0$. The power spectral density is the ratio

$$P\{f\} = \frac{d(\sigma^2)}{df} \tag{2}$$

of the variance contribution $d(\sigma^2)$ to the random error, per unit frequency bandwidth, df, at the frequency, f.
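With the one-sided scaling of (1), the variance is simply the area under the PDS over non-negative frequencies. The sketch below shows one way this integration might be carried out numerically on tabulated values, using the trapezoidal rule; the function name `variance_from_pds` and the sample spectrum are hypothetical, and the paper itself does not prescribe a quadrature method.

```python
import math


def variance_from_pds(freqs, pds):
    """Approximate sigma^2 as the integral of P{f} df by the trapezoidal rule.

    `freqs` are increasing non-negative frequencies and `pds` the one-sided
    power density values at those frequencies.
    """
    if len(freqs) != len(pds) or len(freqs) < 2:
        raise ValueError("need two or more matching frequency/PDS samples")
    total = 0.0
    for i in range(1, len(freqs)):
        df = freqs[i] - freqs[i - 1]
        total += 0.5 * (pds[i] + pds[i - 1]) * df
    return total


# Hypothetical flat spectrum of density 2.0 over the band 0 to 4.
freqs = [0.0, 1.0, 2.0, 3.0, 4.0]
pds = [2.0] * len(freqs)
sigma2 = variance_from_pds(freqs, pds)
sigma = math.sqrt(sigma2)
```

For this flat example the area is the density times the bandwidth, and the standard deviation follows as the square root of the variance.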





II. PREDICTION OF RANDOM TROPOSPHERIC ANGLE ERRORS 


2.1 Model Concept 


The analytical model of random tropospheric errors in radar and op- 
tical systems which has been developed permits the calculation of the 
power density spectra, and variances of range, phase, range difference, 
and angle errors, and their time derivatives from a basic pool of model 
data with the aid of certain model functions. This general model ac- 
commodates many different sets of system parameters, and is flexible 
enough to allow modification for its continuing improvement based upon 
the analysis of additional data. 

During the development of the model the usual lack of sufficient data, and the non-stationarity of the tropospheric refractivity field soon made themselves felt. It was realized therefore, that only an approximate model of the real troposphere could readily be constructed, which necessarily would yield approximate predictions. However, it was found that this approximate model was good enough to allow the useful prediction of tropospheric errors in several interesting cases of tracking system analysis and synthesis.

In the following part of this paper a simplified version of the general 
model of random tropospheric errors is described, which is limited to the 
prediction of the power density spectra of the random tropospheric errors 
in the angle measurements made by a single-site radar (or radio tracker) 
on an (almost) stationary target. 

The particular coordinate of radar measurements selected for the col- 
lective description of the pool of basic model data is the slant-range 
coordinate. In this approach, all available observations of random tropo- 
spheric errors are first normalized to certain model conditions, and trans- 
formed into the slant-range coordinate. An analytical power density 
spectrum (PDS) at the lower limit of these normalized and transformed 
observations is then defined as the model PDS in range, P.,.{f}, where f 
is the (error-) frequency. 

The derivation of the PDS for the random tropospheric angle errors, 
and for other than the model conditions, is then achieved by processing 
the range model PDS, P.,.{f}, with certain power gain functions, called 
the model functions. These model functions depend on such parameters 
of the tracking situation as the antenna diameter, slant-range and eleva- 
tion angle of the target, weather, and wind. 

It may be noted particularly, that in this simplified version of the 
model it is not necessary to do any explicit processing in the space do- 
main. Based upon the assumption of an isotropic, and frozen turbulence 
field of refractive index anomalies in the troposphere, all processing is 
confined to the frequency {f}-domain. 

The entire model also can be used in an inversion of the computational 
flow to yield, from new observations, additional information on the basic 
range model PDS, Pn{f}, and on the model functions. 


2.2 Assumptions and Limitations 


The simplified analytical model of tropospheric random errors in radar 
and optical systems is subject to a number of assumptions and limita- 
tions: 

(i) The model is intended to yield tropospheric errors in tracking
tasks where one point target is directly observed within the local horizon
along a line-of-sight (LOS) by a single observer.

(ii) It is assumed that the random errors are small; thus the model
functions are linear in the sense of being independent of the magnitude
of the errors.

This assumption is justified by the finding that the random errors in 
the observed quantities (range, angles) have relative magnitudes of only 
one part in 10^5, or so.

(iii) It is assumed that the random errors are stationary during the
calculation, or observation of one PDS. The spatial and temporal non- 
stationarities of the random tropospheric errors are only considered by 
the introduction of the “global” weather functions. Local anomalies, as 
well as diurnal and seasonal variations of the tropospheric random errors 
thus are not separated here. It is believed that more detailed knowledge 
in this respect is better obtained by direct measurements under the 
particular local circumstances of actual tracking situations. 

(iv) It is assumed that the random errors due to the tropospheric
anomalies can be treated as if they were caused by the motion of a locally
isotropic field of “frozen” turbulence through the line-of-sight with an
effective wind speed (u) normal to the LOS.

(v) The wavelength (λ) of the transmitted electromagnetic waves is
assumed to be small, say λ < 10 [cm], in order to avoid basic theoretical
difficulties which are manageable only if λ ≪ l, where l is the
characteristic length of the tropospheric anomalies (Refs. 1, 2). This
assumption is also important in order to avoid the effects of random
propagation through the ionosphere.

(vi) The size of the antenna system is small with respect to the
diameter of the earth (flat earth assumption).

(vii) The size of the antenna system is small enough to avoid the lack
of correlation between the refractivity anomalies at large distances on
the surface of the earth.


2.3 Model Functions 


2.3.1 Model Power Density Spectrum in Range, P_m{f}


The conditions to which the available data on random tropospheric
range, phase, and refractive index variations are normalized are:
(i) effective tropospheric path length = L_m = 15 [km];
(ii) effective wind speed normal to LOS = u_m = 1 [m/sec];
(iii) surface refractivity = N_m = 10^6 (n_m − 1) = 313; this value is
the U.S. average, and n_m is the equivalent refractive index;
(iv) known effects of variations of the surface refractivity are not
corrected during data acquisition.

After transformation to the selected range coordinate, the model power
density spectrum, P_m{f}, is then derived as an analytical approximation
to the lower limit of all observations.

The derived model PDS in range consists of five branches, which are 
linear in a log (power density) versus log (frequency) plot, namely: 


(9.6 x 107° ¢* [m?/Hz] 
for 0O<f S25 X 10° [Hz] 


1.5 X 10° f7 [m?/Hz] 
for 25X10° Sf S10 X 10° [Hz] 


Pal} = 4.7 X 107" f°? [m?/Hz] 
ave for 10 10° Sf < 1.0 X 10° [Hz] 


1.5 X 10°” f* [m’/Hz] 
for 10X10° <f < 1.0 x 10° [Hz] 


1.5 X 10° f° [m’/Hz] 
for 10X 10° Sf S @ [Hz] 


(3) 


where the frequency f is to be inserted in hertz. This PDS is plotted 
in Fig. 1. 


2.3.2 Angle Scale Function, S_α


The PDS of random tropospheric angle (α) errors for a single antenna
radar is obtained from the range model PDS by operating on P_m{f}
with the angle scale function, S_α. The derivation of this function is
based upon the fact that refractivity anomalies of characteristic length
l, or of wavenumber κ, which drift through the LOS with the effective
wind speed u_m, cause random error components of frequency


$$
f = u_m/l = \kappa u_m/2\pi. \tag{4}
$$


To simplify the analysis the circular antenna aperture of diameter d 
is now approximated by an interferometer system of equal angle ac- 
curacy and baseline length 


B = 0.626 d (5) 


which lies in the plane of the angle being measured. The angle
measurement is thought to be indirectly obtained by a range-difference
(or phase-difference) measurement across the effective baseline length B.
The tropospheric refractivity anomalies disturb this range-difference
measurement to an amount that depends on the characteristic length l and
on the antenna diameter d.

Fig. 1 — Model power density spectrum, P_m{f}, of tropospheric random
errors in the range coordinate versus error-frequency, f, in a log-log
plot. P_m{f} is valid for the model conditions in Section 2.3.1. The
tropospheric anomalies have the characteristic length l.


It is found that for relatively short characteristic lengths

$$
l \le l_1 = 2d \tag{6}
$$

which cause the high error frequencies

$$
f \ge f_1 = u_m/l_1 = 0.5\,u_m/d \tag{7}
$$

the random range errors due to these anomalies at the two ends of the
effective baseline length B are practically uncorrelated with each other.
Thus, they cause a power density of the random error in the
range-difference measurement that is twice as large as that of the random
error in a single range measurement. Analytically, this finding may be
expressed with the aid of a range-difference scale function

$$
S_{\Delta R} = 2 \quad \text{for } f \ge f_1 = 0.5\,u_m/d. \tag{8}
$$


The tropospheric refractivity anomalies with characteristic lengths
larger than the critical length l_1, namely

$$
l > l_1 = 2d \tag{9}
$$

cause the lower error frequencies

$$
f \le f_1 = u_m/l_1 = 0.5\,u_m/d. \tag{10}
$$

In this frequency region, the induced random range errors at the two
ends of the effective baseline B are more and more correlated as the
characteristic length is increased. It is found that with respect to the
range-difference errors across B the antenna behaves like a high pass
filter with break frequency f_1. Analytically, the resulting reduction in
the power density of the low frequency random range-difference errors may
be expressed by another branch of the range-difference scale function,
namely

$$
S_{\Delta R} = 2\,(f/f_1)^2 \quad \text{for } f \le f_1 = 0.5\,u_m/d. \tag{11}
$$


The multiplication of the range model PDS, P_m{f}, with the two
branches of S_ΔR in their respective frequency regions would result in a
PDS for the random tropospheric range-difference errors across the
baseline B under model conditions.

The last step in the derivation of the desired angle scale functions is
based upon the assumption of small angular deviations relative to the
axis of the antenna system. Then the angle error α is simply related to
the range-difference error ΔR and the effective baseline length B by

$$
\alpha = \Delta R/B. \tag{12}
$$

In terms of power densities, this relation permits the calculation of the
angle scale function S_α from the range-difference scale function S_ΔR
and B, in general, as

$$
S_\alpha = S_{\Delta R}/B^2. \tag{13}
$$


The combination of (5), (8), (11), and (13) finally yields the angle
scale functions in two branches, namely

$$
S_\alpha =
\begin{cases}
20\,(f/u_m)^2 & \text{for } 0 \le f \le f_1 \\
5/d^2 & \text{for } f_1 \le f \le \infty
\end{cases}
\tag{14}
$$

The break frequency of the angle scale function is

$$
f_1 = 0.5\,u_m/d, \tag{15}
$$


where u_m = 1 [m/sec] is the model wind speed taken normal to the LOS
and in the plane of the angle measurement, and d is the diameter of the
circular antenna aperture.


2.3.3 Aperture Smoothing Function, B_a


The spatial smoothing of tropospheric random error components that
are due to refractivity anomalies of small characteristic length (l < d)
is another function of the antenna diameter, d. Here the combined effects
of several small refractivity anomalies tend to cancel across the antenna
aperture, hence the antenna acts like a low-pass filter in the
error-frequency {f}-domain. A simplified aperture smoothing function which
analytically represents this effect is

$$
B_a =
\begin{cases}
1 & \text{for } 0 \le f \le f_2 \\
(f_2/f)^2 & \text{for } f_2 < f < \infty
\end{cases}
\tag{16}
$$

where

$$
f_2 = 2\,u_m/d \tag{17}
$$

is the break frequency for aperture smoothing in angle measurements.
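
The two antenna-dependent model functions of Sections 2.3.2 and 2.3.3 can be sketched numerically as follows. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the angle scale branches follow (14) and (15), and the second-power roll-off of the aperture smoothing function is read from a poorly legible equation (16), so it is exposed as a parameter rather than fixed.

```python
def angle_scale(f, d, u_m=1.0):
    """Angle scale function S_alpha per (14); f in Hz, d in m, u_m in m/sec."""
    f1 = 0.5 * u_m / d                 # break frequency, eq. (15)
    if f <= f1:
        return 20.0 * (f / u_m) ** 2   # correlated region, rises as f^2
    return 5.0 / d ** 2                # uncorrelated region, constant


def aperture_smoothing(f, d, u_m=1.0, exponent=2.0):
    """Aperture smoothing function per (16); 'exponent' is an assumed value."""
    f2 = 2.0 * u_m / d                 # break frequency, eq. (17)
    if f <= f2:
        return 1.0
    return (f2 / f) ** exponent        # low-pass roll-off above f2
```

The two branches of (14) meet at f_1, since 20 (0.5/d)^2 = 5/d^2, which gives a quick consistency check on the reconstructed constants.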


2.3.4 Effective Path Length Function, A 


From theories of propagation through a uniformly turbulent random
medium, as for example given by Chernov, it is known that the variance
of phase, range and related errors is proportional to the path length.
This proportionality holds in both the near-field and the far-field regions
of the “scattering” refractivity anomalies of a given characteristic length,
l. Therefore, it is possible to account for an effective tropospheric path
length, L, which is different from the model path length L_m, by
multiplying the power density with the effective path length function

$$
A = L/L_m \quad \text{for } 0 \le f \le \infty. \tag{18}
$$


The required effective tropospheric path length, L, may be calculated
by integrating over the geometrical length differentials along the LOS,
which are weighted with the square of the local average refractivity at
the height of each layer of the atmosphere. The necessary data on the
variation of the refractivity with height have been taken from Bean and
Thayer. If the height of the target is h_t ≥ 10 [km] above the surface of
the Earth, and the apparent elevation angle is E_a ≥ 3°, the effective
tropospheric path length becomes

$$
L = L_t/\sin E_a, \tag{19}
$$

where

$$
L_t = (N_s/N_m)^2\,(6.61 - 0.01\,N_s)\ [\mathrm{km}] \tag{20}
$$

is the effective height of the troposphere for surface refractivities

$$
250 \le N_s \le 450,
$$

and N_m is the model surface refractivity.
In this case, which is relevant to many radar and optical tracking
tasks, one thus has the effective path length function as

$$
A = \frac{(N_s/N_m)^2}{L_m \sin E_a}\,(6.61 - 0.01\,N_s)\ [\mathrm{km}]
\quad \text{for } 0 \le f \le \infty. \tag{21}
$$
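
As a minimal sketch of (19) through (21), the following computes the effective path length and the path length function from the surface refractivity and the apparent elevation angle. The function names are hypothetical; the defaults N_m = 313 and L_m = 15 km are the model conditions of Section 2.3.1, and the refractivity range check mirrors the stated validity region.

```python
import math


def effective_path_length_km(n_s, elevation_deg, n_m=313.0):
    """Effective tropospheric path length L [km] per (19) and (20)."""
    if not 250.0 <= n_s <= 450.0:
        raise ValueError("surface refractivity outside 250..450")
    # Effective height of the troposphere, eq. (20).
    l_t = (n_s / n_m) ** 2 * (6.61 - 0.01 * n_s)
    # Slant path through that height, eq. (19).
    return l_t / math.sin(math.radians(elevation_deg))


def path_length_function(n_s, elevation_deg, l_m_km=15.0, n_m=313.0):
    """Effective path length function A = L / L_m per (18) and (21)."""
    return effective_path_length_km(n_s, elevation_deg, n_m) / l_m_km
```

For the model refractivity N_s = 313 the effective height is 6.61 − 3.13 = 3.48 km, so A equals unity at the elevation for which 3.48/sin E_a = 15 km.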


2.3.5 Weather Functions, W 


As stated in Section 2.3.1, the model power density spectrum,
P_m{f}, is defined as the lower limit of the available observations
normalized to the model conditions. Essentially all (say 99 percent)
normalized observations exhibit larger errors than given by P_m.
Consequently, all PDS directly derived from P_m{f} for other coordinates
and tracking situations also would only give the expected minimum errors.
Since it is frequently desired to state more about the expected
distribution of the derived PDS above the expected minimum level, we have
introduced certain power gain functions, called weather functions, W.
These are defined as the maximum weather function W_max{f}, which covers
the maximum errors previously observed, and the median weather function
W_med{f}. On a “global” basis (actually only embracing all circumstances
of previous observations entered into the model data), it is expected that
50 percent of the measured PDS will lie above and 50 percent below the PDS
predicted with W_med{f}, and essentially all (say 99 percent) of the
measured PDS will lie below the PDS predicted with W_max{f}.


The maximum weather function derived from available observations
(Refs. 4 to 7) is


6 for 0</f < 1.00 x 10° [Hz]) 


1.89 x 107 f* 
for 1.00 X 10° Sf S 2.23 x 10° [Hz] 


Winaxtf} = 20 for 2.23 X 10° sf < 1.00 X 10° [Hz]; (22) 


6.32 x 107 fo 
for 1.00 X 10° $f S$ 1.00 X 10° [Hz] 


200 for 1.00 X 10° Sf < » [Hz] 


where f is in hertz.
The median weather function is taken as

$$
W_{med}\{f\} = \left( W_{max}\{f\} \right)^{1/2}. \tag{23}
$$

Both functions are plotted in Fig. 2. Note that W_min = 1 by definition.


2.3.6 Effective Wind Functions, U 


The purpose of the effective wind functions, U, is to introduce other
magnitudes of effective wind speed, u_α ≠ u_m, into the model. The
derivation of these effective wind functions rests upon the assumption
that an isotropic, frozen turbulence field of refractivity anomalies
exists in the troposphere which moves through the LOS with a constant
effective wind speed component, u_α, normal to the LOS and in the plane
of the angle coordinate α. With this assumption, a given anomaly causes
an angle error of a magnitude that is independent of u_α, and of a
frequency that is proportional to u_α.

Fig. 2 — Maximum and median weather functions, W_max{f} and W_med{f},
respectively, versus decadic logarithm of the error frequency,
log_10 (f/[Hz]).

It was found that a PDS that is given in {f, u_m}-space as a sum of
branches of the form


$$
P_\alpha\{f; u_m\} = P_0\,(f/f_0)^n \quad \text{for } f_{m,\min} \le f \le f_{m,\max} \tag{24}
$$

is transformed into an equivalent PDS in {f, u_α}-space by the relations

$$
P_\alpha\{f; u_\alpha\} = U_P \cdot P_\alpha\{f; u_m\}
\quad \text{for } U_f\,f_{m,\min} = f_{\alpha,\min} \le f \le f_{\alpha,\max} = U_f\,f_{m,\max}, \tag{25}
$$

where the effective wind function for the transformation of the power
density is

$$
U_P = (u_m/u_\alpha)^{n+1} \tag{26}
$$

and the effective wind function for the transformation of the frequency
regions is

$$
U_f = u_\alpha/u_m. \tag{27}
$$

In these relations f_α,min and f_α,max are the limits of the frequency
region in which P_α is valid after the transformation to {f, u_α}-space.
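
The transformation of a single power-law branch can be sketched as below. The exponent n + 1 in the power-density wind function is taken here as an assumption; it is the value that follows from the stated premise that a change of wind speed rescales the error frequency in proportion to u_α while leaving the error magnitude (and hence the variance contained in the branch) unchanged. All names are hypothetical.

```python
def wind_functions(u_a, n, u_m=1.0):
    """Return (U_P, U_f) for a branch with spectral exponent n."""
    u_p = (u_m / u_a) ** (n + 1)   # power-density factor, eq. (26) as assumed
    u_f = u_a / u_m                # frequency-region factor, eq. (27)
    return u_p, u_f


def transform_branch(p0, f0, n, f_min, f_max, u_a, u_m=1.0):
    """Map the branch P0 (f/f0)^n on [f_min, f_max] to wind speed u_a.

    Returns the transformed PDS as a callable and the new frequency limits.
    """
    u_p, u_f = wind_functions(u_a, n, u_m)

    def pds(f):
        return u_p * p0 * (f / f0) ** n

    return pds, u_f * f_min, u_f * f_max
```

For example, a branch 2 f^-2 valid from 0.1 to 1 Hz at u_m = 1 m/sec maps, at u_α = 5 m/sec, to 10 f^-2 valid from 0.5 to 5 Hz; both integrate to the same variance of 18 units.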

In the special case of small angular velocity of the LOS, the equivalent
wind due to the angular rate is negligible compared to the natural winds
in the atmosphere. The effective wind speed is then simply

$$
u_\alpha = w_\alpha, \tag{28}
$$

where w_α is that component of the natural wind which is normal to the
LOS and in the plane of the angle α. In this plane of the angle α, the
atmospheric refractivity anomalies, on the average, appear to move
through the LOS with the speed w_α.

The calculation of the effective wind speed for azimuth (α → A)
angle errors depends on the geometrical relations between the LOS and
the natural average wind vector, w, Fig. 3. If A is the azimuth angle of
the LOS, and δ is the azimuth of the wind vector, their difference

$$
\beta = \delta - A \tag{29}
$$

can be used to calculate the effective wind speed for azimuth angle
errors

$$
u_A = |w_A| = |\mathbf{w}| \cdot |\sin\beta|. \tag{30}
$$

With the horizontal LOS component (see Fig. 3)

$$
w_1 = |\mathbf{w}| \cdot \cos\beta \tag{31}
$$

the effective wind speed for elevation errors similarly becomes, Fig. 4,

$$
u_E = |w_E| = |w_1| \cdot |\sin E_a| \tag{32}
$$

or

$$
u_E = |\mathbf{w}| \cdot |\cos\beta \cdot \sin E_a|. \tag{33}
$$


Only the magnitudes of the effective wind speeds are of interest in this 
special case of small angular velocity of the LOS, and a single-site radar. 
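
The geometry of (29) through (33) reduces to two short expressions. The sketch below is illustrative only; the function and argument names are hypothetical, and the numerical inputs in the note that follows are example values, not results from the paper's Appendix.

```python
import math


def effective_wind_speeds(wind_speed, wind_azimuth_deg,
                          los_azimuth_deg, elevation_deg):
    """Return (u_A, u_E): effective wind speeds for azimuth and elevation."""
    # Angle between wind vector and horizontal LOS projection, eq. (29).
    beta = math.radians(wind_azimuth_deg - los_azimuth_deg)
    elevation = math.radians(elevation_deg)
    # Component normal to the LOS in the horizontal plane, eq. (30).
    u_azimuth = abs(wind_speed * math.sin(beta))
    # Horizontal LOS component projected normal to the LOS, eq. (33).
    u_elevation = abs(wind_speed * math.cos(beta) * math.sin(elevation))
    return u_azimuth, u_elevation
```

With an 8.5 m/sec wind along azimuth 180°, a LOS azimuth of 128.5° and an elevation of 24.5°, this gives roughly 6.65 m/sec for azimuth errors and 2.19 m/sec for elevation errors. Because only magnitudes enter, reversing the wind direction by 180° leaves both values unchanged.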


2.4 Computation of the Predicted PDS 


In the prediction of the PDS of random tropospheric angle errors for a
particular tracking situation, numerical values are inserted for all
independent parameters in the model functions given above. The range
model PDS, P_m{f}, is then multiplied by the model functions, within the
limits of the stated frequency regions, in the following sequence: angle
scale function, aperture smoothing function, effective path length
function, weather function, and effective wind functions.

Fig. 3 — Horizontal projection of LOS, wind vector w, and azimuth angle A.

For the tracking situation of the Early Bird observations on May 7,
1965 the numerical values for the parameters and model functions are
given in the Appendix. The operation of these model functions on the
range model PDS, P_m{f}, resulted in four PDS, namely a predicted
minimum spectrum and a predicted maximum spectrum for each of the two
angle coordinates azimuth and elevation. The resulting PDS P_A,min,
P_A,max and P_E,min, P_E,max are plotted over the interesting frequency
range in Figs. 12 and 13.


III. OBSERVATIONS OF RANDOM TROPOSPHERIC ANGLE ERRORS ON THE 
EARLY BIRD SATELLITE 


3.1 Data Acquisition 


3.1.1 Method and Equipment 


For the acquisition of the data on random angle variations in azimuth
and elevation, the apparent position of the microwave beacon (frequency
about 4000 MHz) of the Early Bird satellite was measured with the horn
antenna (aperture diameter d = 67.7 ft) and its associated equipment.
During these measurements the communications carrier of the satellite
was switched off; this resulted in an increase in beacon signal strength
to such a level that the thermal receiver noise in the obtained angle
measurements was negligible compared to the desired tropospheric
random errors.

Fig. 4 — Projection of wind vector into vertical plane through LOS with
apparent elevation angle E_a.

The signal flow through the major pieces of equipment which were
used is illustrated in Fig. 5. After acquisition of the Early Bird satellite
beacon in the main beam (beamwidth θ = 0.225 deg) of the horn antenna,
the antenna control was turned over to the vernier autotrack system, and
the servo loop opened by switching off the hydraulic drive motors. The
antenna was now fixed in an orientation indicated by the digital display
of the azimuth (A) and elevation (E) angles given in degrees, and derived
from digital data pickoff units, which have a precision of encoding of
0.00275 [deg].

The satellite now appeared to drift through the fixed horn antenna 
beam in an irregular motion, which was partially due to motion in its 
true position (orbit), but also due to the refractive index variations in 
the intervening atmospheric propagation medium, and possibly other 
disturbances. The apparent angular position of the satellite relative to 
the electrical axis of the horn antenna on the ground was determined by 
the autotrack system which contains angle error sensing and processing 
equipment. The azimuth and elevation error signals, ΔA{t} and ΔE{t},
from the autotrack system were passed through low-pass recording filters 
before recording either by oscilloscope and camera, or by analog strip 
chart recorder. 

The photographic pictures of the oscilloscope display giving ΔA vs
ΔE were only used for inspection. The strip chart recordings giving the
ΔA{t}, ΔE{t} time series, however, were used for the more detailed
analysis of the data, as described later.


3.1.2 Propagation Path and Mean Satellite Motion 


During the measurements the propagation path pointed from the horn
antenna near Andover, Maine, to the Early Bird satellite approximately
at an azimuth angle A ≈ 128.5° (southeast), and an elevation angle
E ≈ 24.5°. The slant range between ground antenna and satellite was
about 24,300 [statute miles] ≈ 39,100 [km]. The terrain surrounding the
Earth Station may be described as a shallow bowl of perhaps 10 miles
diameter surrounded by hills of up to about 3.5 [deg] elevation.

The mean apparent satellite motion with respect to the azimuth and
elevation angles given above consisted of (i) a small linear drift with an
azimuth component of −2.08 × 10^-? [deg/hr], and an elevation
component of −1.22 × 10^-? [deg/hr], plus (ii) a diurnal elliptical
motion with peak-to-peak amplitudes of A = 0.266 [deg] in azimuth and
E = 0.245 [deg] in elevation. The net result of these components appears
at the Earth Station as a slow motion of the satellite along a helical path
seen under an oblique angle. This picture of the mean apparent satellite
motion was obtained by plotting the hourly azimuth and elevation angles
from the digital display for a few days before the analyzed random angle
error data were recorded. The random azimuth and elevation angle
errors, ΔA and ΔE, which are the subjects of this paper, are superimposed
on this mean apparent motion.

Fig. 5 — Flow diagram of data acquisition.


3.1.3 Date and Time of Observations 


The random angle error data recorded on strip charts, and analyzed
in this paper, were taken on May 7, 1965 between about 23 hr:38 min
EDT and 23 hr:59 min EDT.


3.1.4 Weather Conditions 


Weather data at the horn antenna of the Andover Earth Station were
not taken. However, the weather data may be estimated from those
taken at a private station in nearby Rumford, Me. This estimation yields
the following data (Ref. 18): cloud cover 9/10, wind South 19 [statute
miles/hr], dry bulb temperature 41.8 [°F], dew point 35 [°F], and pressure
28.5 [inches] = 965.0 [millibars].


3.1.5 Recording Filters 


The low-pass recording filters mentioned in Section 3.1.1 above were
simple two-section RC filters. Since the source impedances feeding these
filters are small, and the load impedances connected to their outputs are
large compared to the resistances in the RC sections of the filters, their
inverse power gain is

$$
F = 1 + (\omega T)^2 \left[ (\omega T)^2 + 7 \right].
$$

In this equation, F is the ratio of input power to output power,
T = RC is the time constant of one filter section, and ω = 2πf, where f
is the frequency.

The power density of the random angle errors before the filters may 
then be obtained by multiplying the power density of the recorded 
random angle errors with the inverse power gain F. The filters which 
were used in these observations allowed a choice between two cutoff 
frequencies. The results of numerical calculations of the inverse power 
gains versus frequency for the “LOW” and “HIGH” filters are plotted
in Fig. 6. 
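
The inverse power gain can be checked against the circuit it describes. The sketch below evaluates the closed form 1 + (ωT)^2 [(ωT)^2 + 7] and, independently, the magnitude of the transfer function of two identical cascaded RC sections driven from a low source impedance into a high load impedance, H = 1/(1 + 3sT + (sT)^2). That ladder transfer function is a standard circuit result assumed here, not quoted from the paper; the function names are hypothetical and the time constant is left as a parameter because its values are not stated in this section.

```python
import math


def inverse_power_gain(f, time_constant):
    """Closed-form inverse power gain F of the two-section RC filter."""
    wt = 2.0 * math.pi * f * time_constant
    return 1.0 + wt ** 2 * (wt ** 2 + 7.0)


def inverse_power_gain_from_ladder(f, time_constant):
    """Same quantity from the unloaded two-section RC ladder response."""
    s = 1j * 2.0 * math.pi * f * time_constant
    h = 1.0 / (1.0 + 3.0 * s + s * s)   # voltage transfer function
    return 1.0 / abs(h) ** 2            # input power / output power
```

Multiplying a recorded power density by this factor refers it back to the filter input, as described in the text.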


3.1.6 Calibration 


The sensitivities of the recorded error voltages (after the recording
filters) to errors in the azimuth and elevation angles with respect to the
electrical axis of the horn antenna were obtained by direct calibration
on the Early Bird satellite. For this purpose, the antenna servo system
was disabled, and manual angle offsets were then inserted and their
effects on the strip chart records were measured.

Fig. 6 — Inverse power gain, F, of recording filter versus frequency, f.
F_1 for filter in “LOW” range. F_2 for filter in “HIGH” range.


3.1.7 Oscilloscope Displays 


Photographs of oscilloscope displays of the random elevation error
(ΔE) versus the simultaneously occurring random azimuth error (ΔA)
were also made.

The photo tracing in Fig. 7 was obtained at 22:30 EDT May 7, 1965 
while the recording filters were in the “HIGH” range, and the exposure 
time was five seconds. It is obvious that in this sample of the higher fre- 
quency errors the peak-to-peak azimuth variations (about 40 micro- 
radians) are considerably larger than those of the elevation errors (20 
microradians). 

The photo tracing in Fig. 8 was taken at 22:38 EDT on the same date 
with the recording filters in the “LOW” range. The exposure time was 
two minutes. In this sample of the lower frequency errors the peak-to- 
peak azimuth variations (12 microradians) are slightly smaller than the 
elevation variations (18 microradians). 

As mentioned before, these photos were only used for inspection and
not for numerical analysis.

Fig. 7 — Tracing of oscilloscope photograph of random elevation error (ΔE)
versus random azimuth error (ΔA). Time is the parameter. Recording filters in
“HIGH” range. Exposure time: five seconds.


3.2 Data Processing and Analysis 


3.2.1 General Methods and Equipment 


The data on azimuth (ΔA) and elevation angles (ΔE) versus time were
recorded on a strip chart recorder with the recording filters in the “LOW” 
range. The time series of azimuth and elevation angles were manually 
digitized at two second intervals. 

After the manual digitizing process the time series of azimuth and 
elevation variations were punched into cards for subsequent processing 
on the 7094 digital computer. 


3.2.2 Time Series of Observed Angle Variations 


The time series of the azimuth (ΔA) and elevation (ΔE) angle
variations are shown in Figs. 9 and 10, respectively.

The total observation time was somewhat above twenty minutes. 
This observation time was limited by the mean apparent drift of the 
satellite in the fixed antenna beam. This drift resulted in the recording 
traces going off scale after a certain time. 

A total of 720 azimuth data points and 666 elevation data points were
recorded. Due to the systematic drift, the elevation record went off the
recording scale sooner than the azimuth record.

Fig. 8 — Tracing of oscilloscope photograph of random elevation error (ΔE)
versus random azimuth error (ΔA), both in microradians. Time is the
parameter. Recording filters in “LOW” range. Exposure time: two minutes.


3.2.3 Power Density Spectra of Observed Angle Variations 


The random variations of the observed azimuth and elevation angles
will also be described by their power density spectra (PDS) for
comparison with the predictions. The numerical computation of the PDS
from the time series of data is made on a digital computer by the indirect
method described by Blackman and Tukey. It proceeded in the following
steps: calculation and removal of the mean, and of the linear trend
in the series; tapering the first 5 percent (start) and the last 5 percent
(end) of the time series with a cosine function; computation of the
autocorrelation function versus number r of 0 ≤ r ≤ M time lags, each of
duration of the sampling period Δt; computation of the Fourier transform
of the autocorrelation function by a cosine series, resulting in a raw
power spectrum; and subsequent smoothing of the raw spectrum by sliding,
weighted averages of values for three neighboring frequency steps with
weights 0.25, 0.50, and 0.25.

Fig. 9 — Time series of observed azimuth angles, ΔA, versus time, t.
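
The processing steps listed in this section can be sketched in a few lines. This is not the original 7094 program, whose details are not given; it is a minimal standard-library illustration of the indirect (autocorrelation) method. The exact taper shape, the lag-product normalization, the treatment of the two end points in the smoothing step, and all names are assumptions.

```python
import math


def blackman_tukey_spectrum(x, dt, max_lag, taper_fraction=0.05):
    """Return (frequencies, power spectrum X', power density P')."""
    n = len(x)
    m = max_lag

    # Step 1: remove the mean and the least-squares linear trend.
    t_mid = (n - 1) / 2.0
    x_mean = sum(x) / n
    s_tt = sum((i - t_mid) ** 2 for i in range(n))
    slope = sum((i - t_mid) * (x[i] - x_mean) for i in range(n)) / s_tt
    y = [x[i] - x_mean - slope * (i - t_mid) for i in range(n)]

    # Step 2: cosine taper over the first and last 5 percent.
    n_taper = int(taper_fraction * n)
    for i in range(n_taper):
        weight = 0.5 * (1.0 - math.cos(math.pi * (i + 0.5) / n_taper))
        y[i] *= weight
        y[n - 1 - i] *= weight

    # Step 3: autocorrelation for lags r = 0 ... M.
    acf = [sum(y[i] * y[i + r] for i in range(n - r)) / (n - r)
           for r in range(m + 1)]

    # Step 4: cosine-series transform, giving the raw "power spectrum".
    raw = []
    for h in range(m + 1):
        total = acf[0] + acf[m] * math.cos(math.pi * h)
        total += 2.0 * sum(acf[r] * math.cos(math.pi * r * h / m)
                           for r in range(1, m))
        raw.append(total / m)

    # Step 5: sliding average with weights 0.25, 0.50, 0.25
    # (end points use a reflected neighbor).
    smooth = []
    for h in range(m + 1):
        left = raw[h - 1] if h > 0 else raw[1]
        right = raw[h + 1] if h < m else raw[m - 1]
        smooth.append(0.25 * left + 0.5 * raw[h] + 0.25 * right)

    delta_f = 1.0 / (2.0 * dt * m)            # frequency step
    freqs = [h * delta_f for h in range(m + 1)]
    density = [value / delta_f for value in smooth]
    return freqs, smooth, density
```

The returned density still refers to the output side of the recording filter; referring it to the filter input requires the additional multiplication by the inverse power gain F{f}.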

The computer program actually calculates a quantity X'{f}, called
“power spectrum”, which is related to the usual power density spectrum
P'{f} by the equation

$$
P'\{f\} = X'\{f\}/\Delta f, \tag{34}
$$

where

$$
\Delta f = f_N/M = 1/(2 \cdot \Delta t \cdot M) \tag{35}
$$

and

f_N = Nyquist frequency

M = maximum number of lags in autocorrelation

Δt = sampling period.

In this equation, the primed quantities indicate that they still refer to
the data at the output side of the recording filter. In order to obtain the
desired power density spectrum at the input of the recording filter,
P'{f} must be multiplied by the inverse power gain of the filter, F{f},
yielding

$$
P\{f\} = 2 \cdot \Delta t \cdot M \cdot F\{f\} \cdot X'\{f\}. \tag{36}
$$


Additional smoothing of the power density spectrum is used at the
higher error-frequencies, since many cycles of these angle error
components have been observed. This is done with a filter of approximately
constant relative bandwidth, β = b/f = 0.231, at the expense of absolute
frequency resolution. This point will be illuminated again in Section
3.2.4.

Fig. 10 — Time series of observed elevation angles, ΔE, versus time, t.

The data on azimuth and elevation angle variations given in Section
3.2.2 were analyzed with the methods just described. It was found that
the mean linear trends during these observations were in azimuth
+8.1 × 10^-8 [rad/sec], and in elevation −1.7 × 10^-7 [rad/sec]. Even at
a distance of 10 [km] along the line-of-sight the magnitudes of these
angular rates amount to beam sweeping speeds of less than 0.002 [m/sec],
which are indeed negligible compared to natural wind speeds in the
troposphere.

The power density spectra of the observed azimuth and elevation
variations at the input of the recording filters, P_A{f} and P_E{f}, which
result from these calculations are plotted in Figs. 12 and 13, respectively.
Other spectra also plotted in these figures are explained below.


3.2.4 Confidence Limits for Power Density Spectra 


In computing confidence limits for the power density spectra it is
necessary to distinguish between two error-frequency regions: the
low-frequency region in which the absolute analyzing bandwidth of the PDS
calculation

$$
b = (M \cdot \Delta t)^{-1} \tag{37}
$$

is constant, and the high-frequency region in which the relative
analyzing bandwidth

$$
\beta = b/f \tag{38}
$$

is constant.
In the low-frequency region, with constant absolute analyzing
bandwidth, b, the number of degrees of freedom in the PDS estimate is
approximately

$$
k = 2N/M \tag{39}
$$

where N is the number of data points observed.
In the high-frequency region, with constant relative analyzing
bandwidth, β, the number of degrees of freedom is frequency dependent
according to

$$
k = 2 \beta f N \cdot \Delta t. \tag{40}
$$
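
Both expressions for the number of degrees of freedom are instances of k = 2 × (analyzing bandwidth) × (record length N·Δt). The sketch below evaluates them together. Two points are assumptions rather than statements from the text: the crossover between the two regions is taken where the relative bandwidth βf first exceeds the absolute bandwidth b, and the value of M used in the example is hypothetical, since the number of lags is not stated in this section.

```python
def degrees_of_freedom(f, n_points, max_lag, dt, beta=0.231):
    """Approximate degrees of freedom k of the PDS estimate at frequency f."""
    k_low = 2.0 * n_points / max_lag          # constant-bandwidth region (39)
    k_high = 2.0 * beta * f * n_points * dt   # constant relative bandwidth (40)
    # Assumed crossover rule: whichever analyzing bandwidth is wider applies.
    return max(k_low, k_high)


def crossover_frequency(max_lag, dt, beta=0.231):
    """Frequency at which beta * f equals b = 1 / (M * dt), per (37), (38)."""
    return 1.0 / (beta * max_lag * dt)
```

With N = 720 points, Δt = 2 seconds and a hypothetical M = 60, the estimate has k = 24 below about 0.036 Hz and rises to about 166 at the folding frequency of 0.25 Hz.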


The confidence limits for the calculated PDS of the observations can
now be given, the lower limit being

$$
P_1\{f\} = P\{f\}/K_1\{f\} \tag{41}
$$

and the upper limit

$$
P_2\{f\} = K_2\{f\} \cdot P\{f\}, \tag{42}
$$

where P{f} is the calculated PDS, and K_1{f} and K_2{f} are the
confidence factors.
For a confidence level of p = 95 percent the upper confidence factor
is approximately
2.77 1.30 
Ke = Pm a re pe (43) 


the total confidence factor, K_T (here only used as an intermediate to
obtain K_1),


2.40 
Ky = antilogio (GE) : (44) 


and the lower confidence factor

$$
K_1 = K_T/K_2, \tag{45}
$$

where k is the number of degrees of freedom given above. For k ≥ 5 the
stated analytical approximations for the confidence factors have less
than 10 percent error.

The resulting numerical values for the upper (K_2) and lower (K_1)
confidence factors are plotted in Fig. 11. The results of calculating the
upper confidence limits (P_A2, P_E2), and the lower confidence limits
(P_A1, P_E1) for the azimuth (P_A) and elevation (P_E) spectra are plotted
in Figs. 12 and 13.

Fig. 11 — Relative confidence factors, K_1 and K_2, for p = 95 percent
confidence level versus frequency, f.


3.2.5 Chart Reading Error 


The errors which are introduced into the data by the manual reading
of strip chart records (digitizing) are of the same type as quantization
errors. The variance due to a given quantization step size (q) is known
to be

$$
\sigma_q^2 = q^2/12. \tag{46}
$$

If it is now assumed that the quantization noise, which causes this
variance, is sharply bandlimited white noise of constant power density
(P_q'), and with a cutoff frequency equal to the folding frequency of the
digitized time series (f_N), then one also has the variance as

$$
\sigma_q^2 = \int_0^{f_N} P_q' \, df = P_q' f_N. \tag{47}
$$

Consequently, the noise power density due to the manual chart reading
is

$$
P_q' = \sigma_q^2/f_N = q^2 \cdot \Delta t/6. \tag{48}
$$
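
Equations (46) through (48) amount to a two-line calculation, sketched below with hypothetical function names. The folding frequency is taken as f_N = 1/(2Δt), which is what makes σ_q^2/f_N equal to q^2·Δt/6.

```python
def chart_reading_noise_density(q, dt):
    """White-noise power density of chart reading error, per (46)-(48).

    q  : quantization step size (e.g. in radians)
    dt : sampling period in seconds
    Returns the density in q-units squared per hertz, at the filter output.
    """
    variance = q ** 2 / 12.0        # quantization variance, eq. (46)
    f_nyquist = 1.0 / (2.0 * dt)    # folding frequency of the time series
    return variance / f_nyquist     # equals q**2 * dt / 6, eq. (48)
```

For the step sizes and sampling period quoted in this section (about 1.2 microradians in azimuth, 0.88 microradians in elevation, Δt = 2 seconds), the densities at the filter output come to about 4.8 × 10^-13 and 2.6 × 10^-13 rad^2/Hz, respectively.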


As before, this primed power density is taken at the output side of the
low-pass recording filter. In order to obtain the power density spectrum
of the chart reading error referred to the input of the recording filter, it
is necessary to multiply P_q' with the inverse power gain F_1{f} of the
filter, see Section 3.1.5, which yields here for the azimuth coordinate

$$
P_{qA}\{f\} = F_1\{f\} \cdot P_{qA}' = F_1\{f\} \cdot q_A^2 \cdot \Delta t/6 \tag{49}
$$

and for elevation

$$
P_{qE}\{f\} = F_1\{f\} \cdot P_{qE}' = F_1\{f\} \cdot q_E^2 \cdot \Delta t/6. \tag{50}
$$

Fig. 12 — Power density of random azimuth angle errors, P, versus frequency,
f. P_A, P_A1, P_A2 = observed PDS and 95 percent confidence limits; P_A,min,
P_A,max = predicted tropospheric PDS limits; P_th = thermal receiver noise;
P_qA = manual chart reading error.


The effective quantization step size for azimuth was q_A ≈ 1.2
microradians, and for elevation q_E ≈ 0.88 microradians, the difference
being due to different scale factors in the two channels. The sampling
period as stated before was Δt = 2 seconds. The resulting PDS of the
manual digitizing process, P_qA and P_qE, are also plotted in Figs. 12
and 13.

Fig. 13 — Power density of random elevation angle errors, P, versus
frequency, f, in hertz. P_E, P_E1, P_E2 = observed PDS and 95 percent
confidence limits; P_E,min, P_E,max = predicted tropospheric PDS limits;
P_th = thermal receiver noise; P_qE = manual chart reading error;
k = number of degrees of freedom.


3.2.6 Thermal Angle Errors


During observations of the tropospheric random angle errors it is
important to keep angle errors due to thermal receiver noise at a
comparatively low level. The variance of thermal angle errors may be
obtained as
4 
AC + ) 
= + (51) 


on = 
8 2 


which can be reduced for S/N > 1 to

$$
\sigma_{th}^2 = \frac{\theta^2}{8 B \tau\,(S/N)}, \tag{52}
$$

where
θ = antenna beamwidth
S/N = input signal-to-noise power ratio
B = receiver bandwidth
τ = post-detection integration time.
In order to derive the power density (P,,) of the white thermal noise 
spectrum it is first recognized that the variance of the thermal angle er- 
ror is also 


shed °  Padf F 
eh. Tage o 
where 
fe = 1/27 (54) 


is the cut-off frequency of the post-detection low-pass filter. Equation 
(53) may be integrated with the substitution x = f/f. ; df = f. dx giving 


om = Pu-fe-are tan (f/fe) |o (55) 
or 


on = 5 fe Pu. (56) 


Combining (52), (54), and (56) then yields the desired density of the
thermal angle noise as

$$P_{th} \approx \frac{\theta^2}{2\pi B (S/N)}, \tag{57}$$

independent of the frequency $f$.

During the observations on May 7, 1965, which are analyzed in this
report, the measured signal-to-noise ratio was $(S/N)'$ = 23 [dB] =
200 [1] while the communications carrier of the Early Bird satellite was
switched off. (This ratio was $(S/N)'$ = 13 [dB] = 20 [1] due to a weaker
beacon signal when the carrier was on.) The primed signal-to-noise
ratios stated here are referred to a 3-kHz bandwidth. The effective noise
bandwidth, however, is considerably lower due to the employment of a
phase-locked tracking loop quite like the one described in Ref. 15. From
Fig. 12 in that reference it is seen that the noise bandwidth for
$(S/N)'$ = 23 [dB] is $B$ = 390 [Hz], which further results in an effective
signal-to-noise ratio

$$S/N = 200 \times (3{,}000/390) = 1{,}538\ [1].$$

Since the antenna beamwidth was $\theta = 0.225° = 3.94 \times 10^{-3}$ [rad]
the desired power density of the thermal receiver noise with (57) here
becomes $P_{th} = 4.1 \times 10^{-12}$ [rad²/Hz] while the communications
carrier is switched off, and the beacon signal is strong. This thermal
noise level in the angle measurements was low enough to permit observation
of the random tropospheric angle variations up to frequencies of a few
0.1 [Hz], see Figs. 12 and 13.
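
The arithmetic of this paragraph can be retraced with a short sketch. It takes the thermal density in the form $\theta^2 / (2\pi B (S/N))$, which is what (52), (54) and (56) give when combined, and rescales the quoted ratio of 200 from the 3-kHz reference bandwidth to the 390-Hz loop bandwidth. The function and constant names are illustrative, not from the paper.

```python
import math


def thermal_noise_density(beamwidth_rad, bandwidth_hz, snr):
    """Thermal angle-noise density theta^2 / (2*pi*B*(S/N)) in rad^2/Hz."""
    return beamwidth_rad ** 2 / (2.0 * math.pi * bandwidth_hz * snr)


SNR_REFERENCE = 200.0     # 23 dB, referred to the reference bandwidth
B_REFERENCE_HZ = 3000.0   # 3-kHz reference bandwidth stated in the text
B_LOOP_HZ = 390.0         # noise bandwidth of the phase-locked loop
BEAMWIDTH_DEG = 0.225     # antenna beamwidth

# Narrowing the noise bandwidth raises the effective S/N in proportion.
snr_effective = SNR_REFERENCE * B_REFERENCE_HZ / B_LOOP_HZ
theta = math.radians(BEAMWIDTH_DEG)
p_th = thermal_noise_density(theta, B_LOOP_HZ, snr_effective)

print(f"effective S/N: {snr_effective:.0f}")
print(f"P_th: {p_th:.1e} rad^2/Hz")
```

With these inputs the effective ratio comes out near 1,538 and the density near $4.1 \times 10^{-12}$ rad²/Hz, in line with the value quoted for the carrier-off case.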


IV. COMPARISON BETWEEN PREDICTED AND OBSERVED ANGLE ERRORS 


The comparison between predicted and observed power density spectra
of random tropospheric angle variations may now be made with the
aid of Figs. 12 and 13, into which all relevant spectra have been entered.
The observed spectra ($P_A$, $P_E$) resulted from the analysis of random
angle error data taken on the Andover Horn to Early Bird path on May 7,
1965 between about 23 hr:38 min EDT and 23 hr:59 min EDT. An
inspection of Figs. 12 and 13 shows that the PDS of the observations
cover about two decades of frequency, namely

$$2.50 \times 10^{-3}\ [\text{Hz}] \le f \le 0.25\ [\text{Hz}].$$


The comparison of the observed PDS ($P_A$; $P_E$) with their respective
predicted PDS ($P_{A,\min}$, $P_{A,\max}$; $P_{E,\min}$, $P_{E,\max}$) yields
almost identical results for the two angle coordinates azimuth ($A$) and
elevation ($E$). In particular it is found that the PDS of the observed
random angle variations ($P_A$; $P_E$), within their respective 95 percent
confidence limits ($P_{A1}$, $P_{A2}$; $P_{E1}$, $P_{E2}$), lie almost exactly
on the predicted minimum power spectra ($P_{A,\min}$; $P_{E,\min}$) for
random tropospheric angle errors.

Thus, the observed PDS match the predicted PDS quite well in the
shape of their frequency dependence. The low level of the observed PDS
relative to the predicted range of PDS is thought to be due to the "good
tracking weather" at the Andover, Maine site and at the particular time
of observation (a quiet night). It must be remembered here that the
prediction is based mainly upon the NBS range and phase measure- 
ments’”'*’ which were obtained in Hawaii and Colorado. Whether the 
low level of the random tropospheric errors observed in Maine is a perma- 
nent property of the site, or a chance occurrence can be decided by the 
analysis of additional observations. 

It is also possible to compare the observed PDS of the azimuth errors 
with that of the elevation errors. It is found that the azimuth errors 
here have a higher level at frequencies f > 0.1 Hz than the elevation 
errors; this is an effect of the higher azimuth wind speed component 
($u_A$ = 6.7 m/sec versus $u_E$ = 2.2 m/sec). Even larger differences between
the azimuth and elevation random errors are expected when their ef- 
fective wind speed components differ by larger amounts. Such wind 
speed differences may be caused by either peculiar orientation of the 
natural wind vector relative to the line-of-sight, or also by differences in 
angular tracking rates. 

Near the high-frequency end of the covered band the observed PDS 
deviate significantly in shape from the predictions. This deviation is 
particularly evident in the steep increase of the observed elevation PDS 
above f = 0.15 [Hz]. This increase is identified as an effect of quantization 
errors in the manual digitizing of the analog strip chart records. The 
transformation of these digitizing errors to the input side of the
recording filters results in the steeply rising PDS ($P_{qA}$, $P_{qE}$) for
these frequencies.

In the frequency band of the observations the angle errors due to 
thermal receiver noise have a PDS ($P_{th}$) which is negligible compared to
that of the tropospheric angle errors, provided the communications 
carrier in the Early Bird satellite is turned off. 

It is also possible to integrate the predicted PDS of the random
tropospheric angle errors over the entire error frequency band, and then to
take the square root to obtain the standard deviation

$$\sigma = \left( \int_0^\infty P\, df \right)^{1/2}.$$

When these integrals are calculated for the predicted minimum and
maximum PDS, it is found that the standard deviations of the random
tropospheric angle errors are expected to lie between $\sigma_{\min} \approx 10$
[microradians] $\approx 2$ [seconds of arc] and $\sigma_{\max} \approx 65$
[microradians] $\approx 13$ [seconds of arc]. This range of values compares
quite well with Kennedy and Rosson's estimate that the tropospheric angle
errors lie between 20 and 50 microradians.
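
The unit conversion behind the two quoted limits can be checked with a few lines. This is only a conversion sketch; the helper name `microrad_to_arcsec` is illustrative, and the integration of the predicted PDS itself is not repeated here.

```python
import math

# One radian expressed in seconds of arc.
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def microrad_to_arcsec(value_microrad):
    """Convert an angle in microradians to seconds of arc."""
    return value_microrad * 1e-6 * ARCSEC_PER_RAD


sigma_min_arcsec = microrad_to_arcsec(10.0)
sigma_max_arcsec = microrad_to_arcsec(65.0)
print(f"10 microradians = {sigma_min_arcsec:.1f} seconds of arc")
print(f"65 microradians = {sigma_max_arcsec:.1f} seconds of arc")
```

The results, roughly 2.1 and 13.4 seconds of arc, agree with the rounded figures of 2 and 13 given in the text.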

The standard deviation of the expected tropospheric angle fluctuations 
versus baseline length was previously calculated from NBS data on re- 
fractivity and range variations by D. K. Barton.” For the equivalent 
baseline length of the Andover horn antenna of about forty feet, Barton’s 
graph shows a standard deviation of perhaps seventy microradians, a 
value slightly above our predicted maximum. 

Some astronomical observations of random fluctuations in angular 
star positions, as quoted by Tatarski,’ show standard deviations of one 
half to one second of arc. These observations have been made under con- 
ditions quite different from those for which our predictions are valid, 
namely with visible light, in clear weather, with smaller apertures, and 
probably only over a small fraction of the entire error frequency band. 
Therefore, it is not too surprising to find that these astronomical measure- 
ments lie below our minimum prediction. 

Within the limitations of the analyzed observations, and of the de- 
scribed model it is concluded that the observed random angle variations 
are essentially due to random variations of the refractive index field in 
the troposphere. The feasibility of acquiring additional data on tropo- 
spheric angle errors with the Andover horn antenna on geo-stationary 
satellites of the Early Bird type therefore was also demonstrated. These 
data may now be obtained on a routine basis with available and operat- 
ing equipment. 

The comparison of the observations given in this paper with the pre- 
diction of random tropospheric angle errors gives some confidence in the 
described analytical model. Additional observations of random tropo- 
spheric angle errors were made with radar and optical equipment over 
other propagation paths. The comparison of these observations with the
relevant predictions from the analytical model (not reported here) is
also satisfactory, and has further strengthened the confidence in the
model.


V. SUMMARY 


Earth-based radar and optical systems which are used to measure the 
position (and its time derivatives) of both distant and near objects are 
ultimately limited in accuracy by random angle variations caused by
fluctuations of the tropospheric refractive index. For the analysis and 
synthesis of these systems an analytical model of the random tropo- 
spheric errors has been developed. 

With this model, the predicted minimum and maximum power density 
spectra (PDS) between which observed PDS of tropospheric errors are 
expected to lie can be analytically calculated. The calculation is per- 
formed by operating with certain model functions, which depend on the 
tracking system parameters, on a model PDS (P,,) given in the range 
coordinate. P,, has been derived from observations of random variations 
in the tropospheric refractive index, and in range and phase measure- 
ments made mainly at the National Bureau of Standards. 

A simplified analytical model of random tropospheric angle errors is 
described here, which is applicable to a tracking situation involving one 
(almost stationary) target and a single observer. This model is also used 
to predict the minimum and maximum PDS of the random tropospheric
azimuth and elevation angle errors for microwave observations of the
Early Bird geo-stationary communication satellite with the large
horn-reflector antenna at the AT&T ground station near Andover, Maine.

The general method of data acquisition, Fig. 5, and the specific cir- 
cumstances of some actual observations on the Early Bird satellite with 
the Andover horn are then described. Microwave azimuth and elevation 
angle measurements for an observation time of about twenty minutes 
were taken on May 7, 1965, while the Early Bird satellite appeared at 
an elevation angle of about 24.5 degrees. 

The analysis of the obtained time series of azimuth and elevation 
angles results in power density spectra ($P_A$ and $P_E$) and associated
confidence limits which represent the observed random angle variations,
see also Figs. 12 and 13. The effect of manual chart reading errors on the
observed PDS was also studied. It was shown to consist of a steep in- 
crease in the PDS at the high frequency end. The effect of thermal 
receiver noise on the observed random angle variations was kept at a 
negligible level. 

The comparison of the predicted PDS of the random tropospheric 
angle errors for the Early Bird observations with the observed PDS leads 
to the conclusion that the observed random angle variations are indeed 
caused by the troposphere. In particular it is found that the PDS of the
azimuth and elevation observations ($P_A$ and $P_E$ in Figs. 12 and 13),
within their respective confidence limits ($P_{A1}$, $P_{A2}$; $P_{E1}$,
$P_{E2}$), lie almost exactly on the predicted minimum power density
spectra ($P_{A,\min}$; $P_{E,\min}$) for random tropospheric angle errors.

The feasibility of acquiring additional data on tropospheric propagation
effects, especially random angle errors, with the Andover horn
antenna on geo-stationary satellites of the Early Bird type, therefore,
was also demonstrated.
APPENDIX


Numerical Calculation of Predicted PDS 


The parameters of the tracking situation during the Early Bird observa- 
tions on May 7, 1965, which permit the prediction of the tropospheric 
angle PDS with the described model are: 

transmission frequency = f; = 4137.86 [MHz] 

antenna diameter = d = 67.7 [ft] = 20.6 [m] 

beamwidth = $\theta$ = 0.225 [deg]

apparent elevation angle = HE, ~ 24.5 [deg] 

azimuth angle = $A \approx 128.5$ [deg]

altitude of horn antenna = $h_1$ = 900 [ft] = 274 [m]

altitude of satellite = $h_2 \approx 22{,}200$ [st. mi.] $\approx 35{,}700$ [km]

slant-range = $R_{12} \approx 24{,}300$ [st. mi.] $\approx 39{,}100$ [km]

wind vector: | # | = 19 [st. mi/hr]; 5 = 0° 

surface refractivity = N, = 301 

With these parameters the model functions for this tracking situation 
are calculated as follows. 

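Break frequency of the angle scale function, (15):

$$f_1 = 2.43 \times 10^{-2}\ [\text{Hz}].$$

Angle scale function, (14):

$$S_a = \begin{cases} 20\,(f/\text{Hz})^2\ [1/\text{m}^2] & \text{for } 0 < f < 2.43 \times 10^{-2}\ [\text{Hz}] \\ 1.18 \times 10^{-2}\ [1/\text{m}^2] & \text{for } 2.43 \times 10^{-2}\ [\text{Hz}] \le f \le \infty. \end{cases}$$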


Break frequency for aperture smoothing, (17):

$$f_2 = 9.71 \times 10^{-2}\ [\text{Hz}].$$

Aperture smoothing function, (16):

$$\Phi_a = \begin{cases} 1 & \text{for } 0 < f < 9.71 \times 10^{-2}\ [\text{Hz}] \\ 9.43 \times 10^{-3}\,(f/\text{Hz})^{-2} & \text{for } 9.71 \times 10^{-2}\ [\text{Hz}] \le f < \infty. \end{cases}$$
Effective tropospheric path length, (19) and (20): 
L = 8.02 [km]. 
Effective path length function, (18): 
A = 0.534 for O<f< o. 
Weather functions: 
mininum: Wyin = 1 (by definition of P,) 
maximum: Wyax as per (22). 
Effective wind speed, azimuth, (30):
$u_A$ = 14.9 [st. mi/hr] = 6.7 [m/sec].
Effective wind speed, elevation, (33):
$u_E$ = 4.9 [st. mi/hr] = 2.2 [m/sec].


Effective wind function for transformation of the power density, azi- 
muth, (26): 


Ups = 0.1497". 


Effective wind function for transformation of the frequency regions, 
azimuth, (27): 


Usa = 6.7. 


Effective wind function for transformation of the power density, ele- 
vation, (26): 


Ure = 0.4557". 


Effective wind function for transformation of the frequency regions, 
elevation, (27): 


Usn = 2.2. 


The operation with these model functions on the range model PDS 
P.»\f} results in the following four predicted PDS of tropospheric random 
angle errors. 
Minimum PDS in azimuth:


7.57 & 107 (f/Hz)* [rad’/Hz] 
for 0<f < 1.68 X 107 [Hz] 


3.55 & 10° (f/Hz)™ [rad’/Hz] 
for 1.68 X 10’ < f < 6.70 X 10° [Hz] 


1.95 & 10° (f/Hz)~°” [rad’/Hz] 
for 6.70 X 10° <f < 6.70 X 10° [Hz] 


1.60 X 10 (f/Hz)™ [rad’/Hz] 


Paymin = for 6.70 X 10° < f S 1.63 X 107 [Hz] 
4.25 * 10°” (f/Hz)™ [rad’/Hz] 
for 163 X 10° <f < 6.51 X 10° [Hz] 
1.81 x 10°” (f/Hz)~ [rad?/Hz] 
for 6.51 X 107° < f S 6.70 X 10” [Hz] 
5.47 X 10° (f/Hz)® [rad’/Hz] 
for 6.70 X 107° Sf < [Hz] 


Maximum PDS in azimuth: 


4.54 x 10%" (f/Hz)™ [rad’/Hz] 
for 0<f < 168 X 10” [Hz] 


2.13 X 10° (f/Hz)™ [rad’/Hz] 
for 1.68 X 10° 


3.90 X 10° (f/Hz)°® [rad’/Hz] 
for 1.49 x 10° 


1.03 <X 10° (f/Hz)~*” [rad?/Hz] 
for 1.63 X 107 


4.38 < 10°" (f/Hz)*” [rad’/Hz] 
for 6.51 X 10° 


3.61 X 10°" (f/Hz)~ [rad’/Hz] 
for 6.70 X 107° 


1.09 X 10°? (f/Hz)~ [rad’/Hz] 
for 6.70 X 10° <f S o@ [Hz] 


IA 


f < 149 X 10 * [Hz] 


1.63 X 107 [Hz] 


IIA 


f 


IA 


P, ymax = 


6.51 X 107° [Hz] 


IIA 


f 


IA 


6.70 X 10° [Hz] 


IIA 
IA 


y 


IIA 
A 


f S$ 6.70 X 10° [Hz] 
Minimum PDS in elevation:


2.01 & 10%” (f/Hz)™ [rad?/Hz] ) 
for 0<f S$ 5.50 X 10° [Hz] 


3.31 X 10° (f/Hz)™ [rad’/Hz] 
for 5.50 X 10° <f S 2.20 X 10° [Hz] 


3.41 X 10°” (f/Hz) °° [rad’/Hz] 
for 2.20 x 10° 


Pe, = 11:60 X 10 (f/Hz)™ [rad’/Hz] 
Emin = for 2.20 X 10° <f < 5.35 X 10° [Hz] 


4.57 X 10°“ (f/Hz)~ [rad’/Hz] 
for 5.35 X 107° 


2.09 X 10°” (f/Hz)~” [rad’/Hz] 
for 2.14 x 107 


2.22 X 10° (f/Hz)~ [rad’/Hz] 
for 2.20 x 10” 


2.20 * 10° [Hz] 


lA 


f 


lA 
IIA IIA 


lA 
ny 
IIA 


2.14 X 107 [Hz] 


2.20 * 10° [Hz] 


IIA 
oo, 
IIA 





IA 


f = « [Hz] 


IIA 


Maximum PDS in elevation: 


1.21 x 10*” (f/Hz)™ [rad’/Hz] 
for Osf 


1.99 X 10° (f/Hz)™ [rad’/Hz] 
for 5.50 X 10~ 


6.81 X 10° (f/Hz)~°” [rad’/Hz] 
for 4.91 x 10° 


Pp _ $1.95 X 10 (f/Hz)” [rad’/Hz] 
oe for 5.35 X 10° 


8.91 X 10-° ({/Hz)™” [rad’/Hz] 
for 2.14 X 107 


4.17 X 10° (f/Hz)~ [rad’/Hz] 
for 2.20 Xx 10° Sf S 2.20 X 10” [Hz] 


4.43 X 10°° (f/Hz)® [rad’/Hz] 
for 220 X 10% <f < © [Hy] J 


5.50 X 10°° [Hz] 


lA 


f < 4.91 x 10° [Hz] 


IA 


5.35 X 10°? [Hz] 


IA 


7 


IIA 


2.14 X 10° [Hz] 


lA 
Sy 
IIA 


2.20 X 107 [Hz] 


IIA 


f 


IIA 
On the Sensitivity of Channel Capacity
for the Gaussian Bandlimited Channel


By I. W. SANDBERG
(Manuscript received June 24, 1966)


It is a classic result of Shannon that binary digits can be communicated
with arbitrarily small error probability at any rate less than

$$W \log_2 \left(1 + \frac{P}{N}\right) \quad \text{(bits/sec)}$$

over a channel with bandwidth W and additive Gaussian noise of average
power N, using signals of average power at most P. However, in Shannon's
proof it is assumed that the input to the receiver is the sum of a linear
combination of the bandlimited functions

$$\varphi_0(t - k/2W) \triangleq \frac{\sin 2\pi W(t - k/2W)}{2\pi W(t - k/2W)}, \qquad -\infty < t < \infty, \quad k = 1, 2, \cdots$$

(which are of course of doubly infinite duration) and a sample function
from an exactly bandlimited Gaussian random process. The fact that
$\varphi_0(k/2W) = 0$ for all integers $k \neq 0$ plays a key role in that it
implies the total absence of intersymbol interference.

As a result of these assumptions, there have been some objections to the
Shannon model in connection with the notion of rate, the fact that the
received signals are entire functions (which are predictable for all time
from a knowledge of their values on any interval of nonzero length) and the
fact that it is not clear whether the performance of the model is critically
dependent on the assumptions that lead to the absence of intersymbol
interference.

Since Shannon's model and his associated ingenious arguments are
widely known and are of great interest, from the point of view of the system
theorist, it is important to be able to prove an "insensitivity theorem" to
the effect that if the model is modified to the extent that: (i) $\varphi_0(t)$
is replaced by an approximating function $\varphi(t)$ with the property that
the signals are of average power at most $\tilde{P}$ where $\tilde{P}$ is
approximately P, and $\varphi(t) = 0$ for $t < t_\varphi$ for some negative
number $t_\varphi$, and (ii) the noise is approximately bandlimited with
bandwidth W, then, subject to some reasonable qualifications, it is possible
to transmit information, with arbitrarily high reliability, at any rate less
than

$$W \log_2 \left(1 + \frac{P}{N}\right).$$
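We prove such a theorem in this paper. In fact, we show that if the noise
has integrable power spectral density $S(\omega)$ for which

$$0 < \inf_{0 \le \omega \le 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

and

$$\tilde{N} \triangleq 2W \sup_{0 \le \omega \le 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p) < \infty$$

(these are very weak assumptions), then any rate

$$R < W \log_2 \left(1 + \frac{\gamma P}{\tilde{N}}\right)$$

is permissible if $\gamma \in (0,1)$ such that [with the understanding that
$\varphi(0) = 1$]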


. _ at Np \ 
en | o(k/2W) |< (1-7) (apes) 





where $\beta$ is an important positive number that depends on R,
$(\tilde{N}/\gamma)$, P, and W.
Observe that if $S(\omega)$ is the ideal spectral density defined by

$$S(\omega) = \begin{cases} N/2W, & |\omega| \le 2\pi W \\ 0, & |\omega| > 2\pi W \end{cases}$$

then $\tilde{N} = N$.


I. INTRODUCTION 


It is a classic result¹ of Shannon that binary digits can be communicated
with arbitrarily small error probability at any rate less than

$$W \log_2 \left(1 + \frac{P}{N}\right) \quad \text{(bits/sec)} \tag{1}$$

over a channel with bandwidth W and additive Gaussian noise of
average power N, using signals of average power at most P. There are,
however, some unrealistic assumptions in Shannon's argument. In
particular, there have been some objections²,³,⁴ to the Shannon model
in connection with, for example, the notion of rate and the fact that the
received signals are entire functions (which are predictable for all time
from a knowledge of their values on any interval of nonzero length).

The purpose of this paper is to focus attention on Shannon's
assumptions¹ and show that they can be modified so that the end result is a
quite detailed and informative statement concerned with a much more
realistic model* of a communication system.


II. REVIEW OF SHANNON’S ARGUMENT 


2.1 The Capacity of the Time-Discrete Gaussian Channel 


Shannon's result for the bandlimited time-continuous channel follows
directly from a result concerned with the following type of time-discrete
channel.

The channel receives one of M equally likely inputs (i.e., code words)
every T seconds. Each input is a real n-vector
$X \triangleq (x_1, x_2, \cdots, x_n)$ which satisfies

$$|X|^2 \le \rho T$$

where $|X|$ denotes the Euclidean norm of X and $\rho$ is a positive
constant independent of X. It is assumed that there exists a positive
constant $\mu$, independent of T, such that $n = 2\mu T$ (with the
understanding that we consider only values of T for which $2\mu T$ is an
integer).

The channel output (i.e., the receiver input) corresponding to the
input X is the n-vector X + Z, in which the components of the "noise
vector" Z are independent Gaussian random variables with mean zero
and variance $\eta$. In its attempt to determine which of the M known code
words was transmitted, the receiver may make an error, and we shall
denote by $p_{ei}$ the probability that an error is made given that code word
i is transmitted.

It is assumed that the channel is used to transmit information in the
following manner. Let a message source produce independent and equally
likely binary digits at the rate R digits per second. Every T seconds,†
one of $2^{RT}$ possible sequences is produced. We set $M = 2^{RT}$ and we
represent each of the binary sequences by a particular code word.

* Some different results concerning the significance of the Shannon bound (1)
are proved in Ref. 4. In particular, there, for certain models, converse
propositions are established.
† We consider only values of T for which RT is an integer.
We say that a rate R is permissible if for each $\epsilon > 0$ there exists
a T and a corresponding code such that

$$\max_i p_{ei} \le \epsilon.$$

It has been proven that the channel capacity C, the least upper bound
of permissible rates, is given by

$$C = \mu \log_2 \left(1 + \frac{\rho}{2\mu\eta}\right) \quad \text{(bits/sec)}.$$

It has also been proven that for R < C there exists a positive number
$\beta = \beta(\eta, \rho, \mu, R)$ such that for each T > 0 there exists a
code with the property that

$$\max_i p_{ei} = \exp\,[-\beta T + o(T)].$$


2.2 The Time-Continuous Bandlimited Channel 


In order to use the ideas and results outlined above in his study of the
time-continuous bandlimited channel, Shannon considers the model
shown in Fig. 1, with the understanding that H represents an ideal
low-pass filter with cut-off frequency W, and $z(\cdot)$ denotes a sample
function of a bandlimited Gaussian random process with mean zero and
power spectral density

$$S(\omega) = \begin{cases} N/2W, & |\omega| < 2\pi W \\ 0, & |\omega| > 2\pi W, \end{cases}$$

where N is a positive constant. Clearly the average power of $z(\cdot)$ is N.

As in the time-discrete case, the message source produces R binary
digits per second, so that every T seconds one of $M = 2^{RT}$ possible
sequences is produced. Consider the ith such sequence. The coder and
signal generator associates with this sequence a particular n-vector


Fig. 1. Model of a Communication System.
$X \triangleq (x_1, x_2, \cdots, x_n)$, where $n = 2WT$, and a corresponding
signal

$$u(t) = \sum_{k=1}^{n} x_k \frac{\sin 2\pi W(t - k/2W)}{2\pi W(t - k/2W)}, \qquad t \in (-\infty, \infty)$$

which is transmitted. This process is repeated every T seconds. It is
assumed that

$$|X|^2 \le 2WPT$$

for each code word, so that, for each signal, as can readily be verified,

$$\frac{1}{T} \int_{-\infty}^{\infty} |u(t)|^2\, dt \le P. \tag{2}$$


Insofar as a physical interpretation of (2) is concerned, the object
on the left is the total energy of $u(\cdot)$ divided by the length of the
interval $[(4W)^{-1}, (4W)^{-1} + T]$ which, considering only the instants
$t = k/2W$, contains all of the samples of $u(\cdot)$ that can be made nonzero.
If (2) holds, then Shannon says that $u(\cdot)$ has average power at most P.

The received signal due to the noise and only the ith sequence is
$u(\cdot) + z(\cdot)$, since the response of H to $u(\cdot)$ is $u(\cdot)$. The
value of this signal at the instant $t = k/2W$ is

$$x_k + z(k/2W) \quad \text{for } k = 1, 2, \cdots, n$$

in which the $z(k/2W)$ are independent* Gaussian random variables
with mean zero and variance N. These sample values are the same as
those that would have been obtained if we had not ignored the effect
at the receiver of transmitted signals due to previous and subsequent
sequences, since the values of such signals at $t = k/2W$ vanish for
$k = 1, 2, \cdots, n$.

Thus, on the basis of the channel capacity result of the previous
section, we see that our continuous channel can process information, with
arbitrarily high reliability, at any rate less than the capacity of the
time-discrete channel with parameters $\mu = W$, $\rho = 2WP$, and
$\eta = N$, that is, at any rate R less than

$$W \log_2 \left(1 + \frac{P}{N}\right).$$
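
The substitution in the last step can be verified numerically. The sketch below takes the time-discrete capacity in the form $C = \mu \log_2 (1 + \rho / 2\mu\eta)$, which is the form that reduces to the Shannon bound under $\mu = W$, $\rho = 2WP$, $\eta = N$, and compares the two expressions. The numerical values of W, P and N are hypothetical and chosen only for illustration.

```python
import math


def discrete_capacity(mu, rho, eta):
    """Capacity mu * log2(1 + rho / (2*mu*eta)) of the time-discrete channel."""
    return mu * math.log2(1.0 + rho / (2.0 * mu * eta))


def shannon_bound(w, p, n):
    """Shannon bound W * log2(1 + P/N) for the bandlimited channel."""
    return w * math.log2(1.0 + p / n)


# Hypothetical example values: bandwidth in Hz, signal and noise power.
W, P, N = 3000.0, 1.0, 0.01

c_discrete = discrete_capacity(mu=W, rho=2.0 * W * P, eta=N)
c_shannon = shannon_bound(W, P, N)
print(f"time-discrete form: {c_discrete:.1f} bits/sec")
print(f"Shannon bound:      {c_shannon:.1f} bits/sec")
```

The two numbers coincide because $\rho / 2\mu\eta = 2WP / 2WN = P/N$.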


* The autocorrelation function of the noise vanishes for $\tau = k/2W$, $k \neq 0$.


2.3 Discussion


The argument of the last section is based on the assumptions that the
input to the receiver is the sum of a linear combination of the
bandlimited functions

$$\varphi_0(t - k/2W) \triangleq \frac{\sin 2\pi W(t - k/2W)}{2\pi W(t - k/2W)}, \qquad -\infty < t < \infty, \quad k = 1, 2, \cdots$$

(which are of course of doubly infinite duration) and a sample function
from an exactly bandlimited Gaussian random process. The fact that
$\varphi_0(k/2W) = 0$ for all integers $k \neq 0$ plays a key role in that it
implies the total absence of intersymbol interference.
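
The zero-crossing property that removes intersymbol interference is easy to confirm numerically. The sketch below samples the ideal pulse $\sin(2\pi W t)/(2\pi W t)$ at the instants $t = k/2W$; the helper name `phi0` and the value of W are illustrative assumptions.

```python
import math


def phi0(t, w):
    """Ideal pulse sin(2*pi*W*t) / (2*pi*W*t), with value 1 at t = 0."""
    x = 2.0 * math.pi * w * t
    return 1.0 if x == 0.0 else math.sin(x) / x


W = 4.0  # hypothetical bandwidth in Hz

# Sample the pulse at t = k/2W for a few integers k on either side of zero.
samples = {k: phi0(k / (2.0 * W), W) for k in range(-5, 6)}
for k, value in samples.items():
    print(f"k = {k:+d}: {value:.3e}")
```

Only the k = 0 sample is nonzero (up to floating-point rounding), so a pulse centered on one sampling instant contributes nothing at any other.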

As a result of these assumptions, there have been some objections
to the Shannon model in connection with the notion of rate,* the fact
that the received signals are entire functions (which are predictable for
all time from a knowledge of their values on any interval of nonzero
length), and the fact that it is not clear whether or not the performance
of the model is critically dependent on the assumptions that lead to the
absence of intersymbol interference.

Since Shannon's model and his associated ingenious arguments are
widely known and are of great interest, from the point of view of the
system theorist, it is important to be able to prove an "insensitivity
theorem" to the effect that if the model is modified to the extent that:
(i) $\varphi_0(t)$ is replaced by an approximating function $\varphi(t)$ with
the property that the signals are of average power at most $\tilde{P}$ where
$\tilde{P}$ is approximately P, and $\varphi(t) = 0$ for $t < t_\varphi$ for
some negative number $t_\varphi$, and (ii) the noise is approximately
bandlimited with bandwidth W, then, subject to some reasonable
qualifications, it is possible to transmit information, with arbitrarily
high reliability, at any rate less than

$$W \log_2 \left(1 + \frac{P}{N}\right).$$

A quite explicit theorem of this type is stated in the next section.


III. THE MORE REALISTIC MODEL


We now consider the system of Fig. 1 to be an approximation to the
Shannon model described in Section 2.2.

Here we assume that $z(\cdot)$ is a sample function from a Gaussian
random process with zero mean and integrable power spectral density
$S(\omega)$ with the property that

$$\sup_{0 \le \omega \le 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$


* Shannon himself has indicated® that care must be taken in the physical in- 
terpretation of the result of Section 2.2. However, he does not discuss the effect of 
intersymbol interference or the effect of the departure of the noise spectrum 
from the ideal spectrum. 
is finite. From the engineering viewpoint, this finiteness condition is a
very weak assumption; it is certainly satisfied if there exists a constant
K > 0 such that $S(\omega) \le K(1 + \omega^2)^{-1}$ for all real $\omega$.

We again suppose that the message source produces one of $M = 2^{RT}$
equally likely binary sequences every T seconds. We assume that there
is a first such sequence and that the coder assigns the code word
$(x_1, x_2, \cdots, x_n)$ to it. After T seconds, the second sequence is
assigned the code word $(x_{n+1}, x_{n+2}, \cdots, x_{2n})$, and so on. The
integer n is equal to 2WT.


The transmitted signal (i.e., the input to the channel) is assumed to
be given by

$$u(t) = \sum_{k=1}^{n} x_k \psi(t - k/2W) + \sum_{k=n+1}^{2n} x_k \psi(t - k/2W) + \cdots$$

in which $\psi(\cdot)$ is a real-valued function of t defined on
$(-\infty, \infty)$ such that there exists a negative constant $t_\psi$ with
the property that $\psi(t) = 0$ for $t < t_\psi$. It is evident that each of
the signal components (i.e., each sum) is associated with a particular code
word, that is, with a particular input sequence to the coder. We note that
the first signal component "begins" at $t = t_\psi + (2W)^{-1}$, the second
at $t_\psi + (2W)^{-1} + T$, and so on.

The operator H in Fig. 1 is assumed here to be causal, linear, and
time-invariant. Thus, the output of H is

$$v(t) = \sum_{k=1}^{n} x_k \varphi(t - k/2W) + \sum_{k=n+1}^{2n} x_k \varphi(t - k/2W) + \cdots$$

in which $\varphi(\cdot)$ is the response of H to $\psi(\cdot)$. Since H is
causal, there exists a negative constant $t_\varphi$ such that
$\varphi(t) = 0$ for $t < t_\varphi$.

We assume that $\varphi(0) = 1$ and that $\varphi(\cdot)$ belongs to $L_2$
(i.e., is square integrable). We think of $\varphi(t)$ as being close to

$$\varphi_0(t) \triangleq \frac{\sin 2\pi W t}{2\pi W t}$$

in the sense that both $\|\varphi - \varphi_0\|$ ($\|\cdot\|$ denotes the
$L_2$ norm) and

$$\sum_{k \neq 0} |\varphi(k/2W) - \varphi_0(k/2W)| = \sum_{k \neq 0} |\varphi(k/2W)|$$

are small. Of course this requires that $-t_\varphi$ be sufficiently large.*


* We may certainly take the view that $\psi(\cdot)$ and H are approximations
to the ideal signal $\varphi_0$ and the ideal bandlimiting filter,
respectively. However, the specific nature of these approximations is not
pertinent to our development. Observe, in fact, that it makes sense for us
to assume here that H is an approximation to the ideal bandlimiting filter,
but that $\psi(\cdot)$ is an impulse-like function. The response
$\varphi(\cdot)$ of H to $\psi(\cdot)$ is what we wish to focus attention on.
It is assumed also that

$$\sum_{k=1}^{n} (x_{k+jn})^2 \le 2WPT$$

for $j = 0, 1, 2, \cdots$, so that the "average power"

$$\frac{1}{T} \int_{-\infty}^{\infty} \left\{ \sum_{k=1}^{n} x_{k+jn}\, \varphi[t - (k + jn)/2W] \right\}^2 dt$$

of the jth component of $v(\cdot)$ is bounded from above by $P + \epsilon_j$,
in which $\epsilon_j \to 0$ as $\|\varphi - \varphi_0\| \to 0$.

The receiver, which is assumed to be in possession of the code, samples
the signal $v(\cdot) + z(\cdot)$ at the instants $t = k/2W$,
$k = 1, 2, \cdots$, to obtain in succession the "received n-vectors"





$$Y_1 \triangleq (v_1, v_2, \cdots, v_n) + (z_1, z_2, \cdots, z_n)$$

$$Y_2 \triangleq (v_{n+1}, v_{n+2}, \cdots, v_{2n}) + (z_{n+1}, z_{n+2}, \cdots, z_{2n})$$

in which $v_k = v(k/2W)$ and $z_k = z(k/2W)$. These vectors are used as
inputs to a minimum distance decoder. Thus, for example, if

$$|Y_1 - X_i| < \min_{l \neq i} |Y_1 - X_l|,$$

in which $\{X_l\}$ denotes the set of code words, then $Y_1$ is decoded as
$X_i$. We denote by $p_{eij}$ the maximum probability, over all possible
sequences of input code words with the jth code word $X_i$, that $Y_j$ is
not decoded as $X_i$. We let

$$p_{ei} \triangleq \sup_j p_{eij}.$$


Our result (which is proved in the next section) is


Theorem: Concerning the system described above, let

$$0 < \inf_{0 \le \omega \le 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

and

$$\tilde{N} \triangleq 2W \sup_{0 \le \omega \le 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p).$$

Then any rate

$$R < W \log_2 \left(1 + \frac{\gamma P}{\tilde{N}}\right) \quad \text{(bits/sec)}$$

is permissible (in the sense of Section 2.1 with $p_{ei}$ as defined above)
provided that $\gamma \in (0,1)$ such that


2 lo(k/2W)| <a — 7) (8) 


where $\beta = \beta[(\tilde{N}/\gamma), 2WP, W, R]$ is the number introduced in Section 2.1.


Remarks: Observe that if $S(\omega)$ is the ideal power spectral density
defined by

$$S(\omega) = \begin{cases} N/2W, & |\omega| < 2\pi W \\ 0, & |\omega| > 2\pi W \end{cases}$$

then $\tilde{N} = N$. The condition that

$$0 < \inf_{0 \le \omega \le 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

is certainly satisfied if $S(\omega)$ is a reasonable approximation to the
ideal spectrum.
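
The quantity $2W \sup \sum_p S(\omega + 4\pi W p)$ appearing in the theorem can be approximated numerically by truncating the sum and scanning a grid of frequencies. The sketch below does this for the ideal spectrum, where the result should equal N, and for a smooth spectrum of the form $K/(1 + \omega^2)$. The function names, the truncation limit, the grid size and the numerical values are all assumptions made for illustration.

```python
import math


def folded_sup(spectrum, w, p_max=50, grid=501):
    """Approximate sup over 0 <= omega <= 2*pi*W of sum_p S(omega + 4*pi*W*p).

    The sum over p is truncated to |p| <= p_max and the supremum is taken
    over a uniform grid, so the result is only an approximation.
    """
    best = 0.0
    for i in range(grid):
        omega = 2.0 * math.pi * w * i / (grid - 1)
        total = sum(spectrum(omega + 4.0 * math.pi * w * p)
                    for p in range(-p_max, p_max + 1))
        best = max(best, total)
    return best


def folded_noise_power(spectrum, w, **kwargs):
    """2W times the folded supremum, i.e. the noise level used in the theorem."""
    return 2.0 * w * folded_sup(spectrum, w, **kwargs)


W = 1.0   # hypothetical bandwidth
N = 0.5   # hypothetical noise power of the ideal spectrum


def ideal_spectrum(omega):
    """Flat density N/2W inside |omega| < 2*pi*W and zero outside."""
    return N / (2.0 * W) if abs(omega) < 2.0 * math.pi * W else 0.0


def smooth_spectrum(omega):
    """Hypothetical non-ideal density with tails outside the band."""
    return 0.25 / (1.0 + omega ** 2)


n_ideal = folded_noise_power(ideal_spectrum, W)
n_smooth = folded_noise_power(smooth_spectrum, W)
print(f"ideal spectrum:  {n_ideal:.4f}")
print(f"smooth spectrum: {n_smooth:.4f}")
```

For the ideal spectrum only the p = 0 term is ever nonzero, which is why the folded level reduces to N. For a spectrum with tails the shifted copies overlap and the folded level accounts for that overlap.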
If S(w) is nonincreasing for w = 0, then for p = 1, 2, ---, 


4rWp 
sup S(w+4rWp) S : = S(w)dw 
0<0<2rW 20 onW 4rW p—27wW 
and 
1 —47W (p—l1) 
sup S(w — 4xWp) Ss —— S(w)dw. 
0<sw<2rW 2Qr 2rW —4rW pt2rW 


Thus, for S(w) nonincreasing for w 2 0, we have the bound 


N <2W sup oF S(w + 4rWp) + = : Ee S(w)dw. 
0<w<2eW p=— 

The exponent 6 has been estimated by Shannon.’ 

The basic idea of the proof of the theorem is, roughly speaking, to
(i) treat as an additional "noise source" the departure of the samples of
$v(\cdot)$ from the corresponding samples in the case of zero
intersymbol-interference (Sublemma 1 of Section IV provides an estimate of
this departure), and (ii) to obtain a lower bound on the channel capacity
of the more-realistic model by comparing its error probability
performance with that of a model possessing zero intersymbol-interference
and independent Gaussian noise samples (this is done in the proof of
Sublemma 2 of Section IV).


IV. PROOF OF THE THEOREM 


4.1 The Discrete Channel 


Consider first a discrete channel with memory that receives one of $M$
equally likely inputs (i.e., code words) every $T$ seconds. As in Section
2.1, each input is a real $n$-vector $X$ which satisfies $|X|^2 \le pT$, $n$ is
equal to $2wT$, and each input represents a particular sequence of $RT$
binary digits. Let $(x_1, x_2, \ldots, x_n)$ denote the first code word,
$(x_{n+1}, x_{n+2}, \ldots, x_{2n})$ the second code word, and so on.

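At time $t = (j - 1)T$, the receiver receives the $n$-vector

$$Y_j = \{y[1 + (j-1)n],\ y[2 + (j-1)n],\ \ldots,\ y(jn)\}$$

in which

$$y(p) = \sum_k x_k \varphi(p - k) + z(p), \qquad p = 1, 2, \ldots,$$

where here $\varphi(\cdot)$ is a function defined on the integers so that $\varphi(0) = 1$
and

$$\sum_k |\varphi(k)| < \infty,$$

and each $z(p)$ is a Gaussian random variable with zero mean. For each $j$,
let

$$Z_j \triangleq \{z[1 + (j-1)n],\ z[2 + (j-1)n],\ \ldots,\ z(jn)\}$$

and

$$V_j \triangleq \{v[1 + (j-1)n],\ v[2 + (j-1)n],\ \ldots,\ v(jn)\},$$

where

$$v(p) = \sum_k x_k \varphi(p - k).$$

Then $Y_j = V_j + Z_j$.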
We assume that the receiver attempts to determine the $j$th code
word $V_j$ by minimum distance decoding as in Section III. Let $p_{ei}$ denote
the error probability associated with the transmission of code word $i$,
as defined in Section III. In Section 4.3 we prove the following result,
which we shall exploit here, concerning this channel.


Lemma: Let $Z_j$, as defined above, possess the property that [with $\mathcal{E}$ the
expectation operator and $(\cdot,\cdot)$ denoting the usual inner product of
$n$-vectors] there exist constants $\epsilon$ and $\eta$ such that for every real $n$-vector $U$ of
unit length:

$$0 < \epsilon \le \mathcal{E}|(U, Z_j)|^2 \le \eta$$

uniformly in $j$ and $n$. Let $\gamma \in (0,1)$. Then any rate

$$R < w \log_2\left(1 + \frac{\gamma p}{2w\eta}\right)$$

is permissible (in the sense of Section 2.1) provided that


2 l@1< a= 9% &) 


py 
where $\beta = \beta[(\eta/\gamma), p, w, R]$ is the number introduced in Section 2.1.


4.2 Completion of the Proof of the Theorem 
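We have

$$\mathcal{E}|(U, Z_j)|^2 = \mathcal{E} \sum_{k,l=1}^{n} u_k u_l\, z[k + (j-1)n]\, z[l + (j-1)n] = \sum_{k,l=1}^{n} u_k u_l\, R[(l - k)/2W]$$

for any real $n$-vector $U$, in which

$$R(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega) e^{i\omega\tau}\, d\omega.$$

Thus,

$$\mathcal{E}|(U, Z_j)|^2 = \frac{1}{2\pi} \sum_{k,l=1}^{n} u_k u_l \int_{-\infty}^{\infty} S(\omega) e^{i\omega(l-k)/2W}\, d\omega$$

$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 S(\omega)\, d\omega$$

$$= \frac{1}{2\pi} \sum_{p=-\infty}^{\infty} \int_{-2\pi W + 4\pi W p}^{2\pi W + 4\pi W p} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 S(\omega)\, d\omega$$

$$= \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)\, d\omega.$$

It follows at once that

$$\mathcal{E}|(U, Z_j)|^2 \le \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)\ \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 d\omega,$$

and that

$$\mathcal{E}|(U, Z_j)|^2 \ge \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)\ \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 d\omega.$$

Since

$$\frac{1}{4\pi W} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 d\omega = |U|^2,$$

we have

$$\mathcal{E}|(U, Z_j)|^2 \le 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

$$\mathcal{E}|(U, Z_j)|^2 \ge 2W \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

for $|U| = 1$, independent of $j$ and $n$. Thus, we may view the time
continuous system of Section III as a discrete-time communication
system of the type described at the outset of this section with $w = W$,
$p = 2WP$,

$$\epsilon = 2W \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p),$$

and

$$\eta = 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p).$$

This proves the theorem.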
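The normalization identity used in this proof, namely that the average of $|\sum_k u_k e^{-i\omega k/2W}|^2$ over one period $4\pi W$ equals $|U|^2$, can be checked numerically. The sketch below is an illustration only; the vectors and the value of W are hypothetical, and the integral is evaluated with a uniform rectangle rule over one full period (which is exact for trigonometric polynomials of degree below the grid size).

```python
import cmath
import math

def mean_square_response(u, W, grid=256):
    """Average of |sum_k u_k exp(-i*w*k/(2W))|^2 over w in [-2*pi*W, 2*pi*W)."""
    total = 0.0
    for i in range(grid):
        w = -2 * math.pi * W + 4 * math.pi * W * i / grid
        s = sum(uk * cmath.exp(-1j * w * (k + 1) / (2 * W)) for k, uk in enumerate(u))
        total += abs(s) ** 2
    return total / grid

W = 4000.0                      # hypothetical bandwidth in Hz
unit_vector = [0.6, -0.8, 0.0]  # hypothetical vector with |U| = 1
other_vector = [1.0, 2.0, 3.0]  # hypothetical vector with |U|^2 = 14

unit_result = mean_square_response(unit_vector, W)
other_result = mean_square_response(other_vector, W)
```

Multiplying this average by $2W$ times the supremum (or infimum) of the folded spectrum gives the upper (or lower) bound stated above.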


4.3 Proof of the Lemma 


With $x_k$ as defined in Section 4.1, let

$$\tilde{V}_j \triangleq \{x_{1+(j-1)n},\ x_{2+(j-1)n},\ \ldots,\ x_{jn}\}.$$


Sublemma 1: 


IV, — Tf 0 (3 | o(k) ) 
Proof:


ak jn iy 2 
[VeVi = 


p=1+ (7-1) n 








two(p —k) — Xp 
i=1 


fo) 2 


S* xol(p — k) 


k=—0 


yn 


I 








p=1t+(7-l1)n 
in which $x_k = 0$ for $k < 1$, $\tilde{\varphi}(0) = 0$, and $\tilde{\varphi}(k) = \varphi(k)$ for $k \neq 0$. Therefore,


|V; - V;)’ a 


Pp 





2d Xp (k) f 
and, by the Schwarz inequality, 

IVe=Vi Ss 2 2 | tre) [P+] oC) | 2 | a(k) | 

< me | ak) | x | 2%» |? % | o(k) |. 


Since 


De | Beas | Seer, 


Pp 


we have

$$|V_j - \tilde{V}_j|^2 \le 2pT \left( \sum_k |\tilde{\varphi}(k)| \right)^2$$

which is the assertion of Sublemma 1.
Therefore, with $Y_j$ and $Z_j$ as defined in Section 4.1, we have

$$Y_j = \tilde{V}_j + E_j + Z_j$$

in which

$$|E_j|^2 \le 2pT \left( \sum_k |\tilde{\varphi}(k)| \right)^2.$$

This fact when combined with the following result* proves the lemma.


Sublemma 2: Consider a time-discrete channel of the type described in
Section 2.1. Replace $Z$ by the $n$-vector $(E + Q)$ in which $E$ is a fixed vector
and the components of $Q$ are Gaussian random variables with zero mean
with the property that there exist constants $\epsilon$ and $\eta$ such that for every real
$n$-vector $U$ of unit length:

$$0 < \epsilon \le \mathcal{E}|(U, Q)|^2 \le \eta$$

uniformly in $n$. Let $\gamma \in (0,1)$. Then any rate

$$R < w \log_2\left(1 + \frac{\gamma p}{2w\eta}\right)$$

is permissible (in the sense of Section 2.1) provided that

$$|E|^2 \le \sigma T$$

for all $T > 0$, in which

$$\sigma < 2(1 - \gamma^{1/2})^2 \gamma^{-1} \eta \beta,$$

where $\beta = \beta(\eta/\gamma, p, w, R)$ is the number introduced in Section 2.1.

* See Ref. 3, Appendix D, for a result related to Sublemma 2.
Proof: Let $T_0 \in (0, \infty)$. Consider the time-discrete channel of Section 2.1
with noise vector $Z$, but with $\eta$ replaced with $(1/\gamma)\eta$. Here for

$$R < w \log_2\left(1 + \frac{\gamma p}{2w\eta}\right)$$

and $T \ge T_0$, there exists a code $\{X_i\}$ such that $X_i \neq X_j$ for $i \neq j$,
and the error probability (using minimum distance decoding) given
that the $i$th code word was transmitted

$$p_{ei} = \Pr \bigcup_{j \neq i} \{ |X_i + Z - X_j| \le |Z| \}$$

is at most $\exp[-\beta T + \theta(T)]$ independent of $i$, where

$$\beta = \beta[(\eta/\gamma), p, w, R]$$

and $\theta(T)/T \to 0$ as $T \to \infty$. For this code, the error probability (using
minimum distance decoding) for the channel described in Sublemma 2 is

$$\hat{p}_{ei} \triangleq \Pr \bigcup_{j \neq i} \{ |X_i + E + Q - X_j| \le |E + Q| \}.$$

Let $c_{ij} \triangleq |X_i - X_j|$, and let $U_{ij}$ denote the unit-length vector
$(X_i - X_j)/c_{ij}$. Then it can easily be shown that

$$|X_i + E + Q - X_j| \le |E + Q|$$

if and only if

$$(U_{ij}, Q) \le -\tfrac{1}{2} c_{ij} - (U_{ij}, E),$$

in which $(\cdot,\cdot)$ denotes the usual inner product of $n$-vectors. Thus,

$$\hat{p}_{ei} = \Pr \bigcup_{j \neq i} \{ (U_{ij}, Q) \le -\tfrac{1}{2} c_{ij} - (U_{ij}, E) \}$$

and similarly,

$$p_{ei} = \Pr \bigcup_{j \neq i} \{ (U_{ij}, Z) \le -\tfrac{1}{2} c_{ij} \}. \tag{3}$$
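The equivalence between the distance comparison and the half-space condition follows by squaring both sides and expanding $|c_{ij} U_{ij} + (E + Q)|^2$. The following sketch checks it on random vectors; the dimension, the seed, and the Gaussian sampling are hypothetical choices made only for the illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def distance_test(xi, xj, e, q):
    """True when |X_i + E + Q - X_j| <= |E + Q| (X_j is at least as close as X_i)."""
    shifted = [a + b + c - d for a, b, c, d in zip(xi, e, q, xj)]
    noise = [b + c for b, c in zip(e, q)]
    return norm(shifted) <= norm(noise)

def half_space_test(xi, xj, e, q):
    """True when (U_ij, Q) <= -c_ij/2 - (U_ij, E), with U_ij = (X_i - X_j)/c_ij."""
    diff = [a - d for a, d in zip(xi, xj)]
    c = norm(diff)
    u = [v / c for v in diff]
    return dot(u, q) <= -0.5 * c - dot(u, e)

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 4                    # hypothetical vector dimension
trials = 2000
agreements = 0
for _ in range(trials):
    xi, xj, e, q = ([rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(4))
    if distance_test(xi, xj, e, q) == half_space_test(xi, xj, e, q):
        agreements += 1
```

Both tests agree on every trial, apart from the measure-zero boundary case where floating-point rounding could in principle differ.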


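Consider (3). Let the $n$-vector $P = (p_1, p_2, \ldots, p_n)$ represent a
general point in Euclidean $n$-space $\mathcal{E}_n$, and let $\mathcal{R}_{ij}$ denote the closed
half-space of $\mathcal{E}_n$ throughout which $(U_{ij}, P) \le -\tfrac{1}{2} c_{ij}$. Let $\mathcal{R}_i \triangleq \bigcup_{j \neq i} \mathcal{R}_{ij}$.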


Then 


—n/2 n 
pei = (20)? (2) i exp | - 52 ps a | dz, +++ dan. 
Ri i= 


Similarly, let $;; denote the closed half-space throughout which 
(Uiz, P) 3 —lhe + (Uss, BY, 
and let 
8; S U 84. 
Then, since 
Dei = Pr 2 {(Uis,v°Q) S —Beu + (Us, D4, 
we have, with A the covariance matrix of the random variables {q,7“}, 
pes = (2)"” (det A) i exp [—3Q'A "Qld «++ dqn. 
Let us assume that 
[Beis + (Uiz, E)W™ = fea (4) 
for all 7 # 7. Then 8;; C ®:; , 8; € A; , and hence 
Dei S (20) ” (det A)? ie exp [—3Q'A "Qld +++ dan. 
Let Q = ZY, where & is the orthogonal matrix such that =A 2 


= diag (Ar, Ao, °**, An), With the understanding that \, and )d, 
denote the smallest and largest eigenvalues of A, respectively. Then 


- 4 he oe 
Pei S (Qa) (Mada ++ An)? i , exp |- 52s At ‘ne | dyr +++ dn 
Ri = 


in which ®; denotes the inverse image of ®; under the transformation 
represented by %. Similarly, 


—n/2 
fer = (2r)"? (2) i? exp] 57D y | ay -+ dYn. 


Since, by assumption, 
eS 8\|(U,Q)/ s 


for every real n-vector U of unit length and every positive integer n, 
it follows that A: = ey and), S ay. We note that for0 <A; < my”: 


hj * exp | - : su? | = (*) exp | - ty? 


provided that y; = n/y. Thus, 


(Qr)"?(Made t+ An)? —_ exp |- DD | dyt +++ dYn 
a 
nk 


—n/2 n 
< (2n)"” (2) | exp | - a, Yas ‘ dy +++ dYn 
Y (t;’—@) 2 k=1 


< xe 
in which @ denotes the hypercube in &, defined by the inequalities: 
pi S n/yiorj = A peter ens 
Therefore, 


Pei sf < | a) 
oa (Ri’—C) 


e 


IIA 


ei + (Qa)? (ade +++ An)? / exp| -3 Ly Me Ye *| aus Se AY as 
, = 


However, 


On "Oe te) / exp | - dy nye | dy: +++ dyn 
e = 


dole 


nly re 
II (Qr) dy ‘yo ody 
—nly 


nr 


r; 


; 2 paly 1 
ry = (2r)" (7) i] exp (- a i’) dy. 
—nly 


IIA 


in which 


Thus,

$$\hat{p}_{ei} \le p_{ei} + r^{n} \le \exp[-\beta T + \theta(T)] + r^{2wT}. \tag{5}$$

Since $r < 1$, the right side of (5) approaches zero as $T \to \infty$. Therefore,
to complete the proof of Sublemma 2, it suffices to show that there
exist values of $T_0$ such that (4) is satisfied (for all $j \neq i$) for all $T \ge T_0$.
We note first that (4) is satisfied if

$$-(U_{ij}, E) \le \tfrac{1}{2}(1 - \gamma^{1/2}) c_{ij} \tag{6}$$

for all $j \neq i$. Since $-(U_{ij}, E) \le |E|$, (6) is satisfied if

$$|E| \le \tfrac{1}{2}(1 - \gamma^{1/2}) c_{ij} \tag{7}$$

for all $j \neq i$.


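We now estimate the numbers $c_{ij}$. We have,’ with $a \triangleq \tfrac{1}{2} c_{ij} (\gamma/\eta)^{1/2}$,

$$\exp[-\beta T + \theta(T)] \ge p_{ei} \ge \Pr\{(U_{ij}, Z) \le -\tfrac{1}{2} c_{ij}\} = (2\pi)^{-1/2} \int_a^{\infty} e^{-x^2/2}\, dx,$$

for any $i$ and any $j \neq i$, since the variance of $(U_{ij}, Z)$ is $\eta/\gamma$. Therefore,

$$\exp[-\beta T + \theta(T)] \ge (2\pi)^{-1/2} \int_a^{\infty} e^{-x^2/2}\, dx = (2\pi)^{-1/2} \int_{a^2/2}^{\infty} e^{-y} (2y)^{-1/2}\, dy. \tag{8}$$

Let $\delta > 0$ be a constant, and let $\alpha(\delta)$ denote the smallest nonnegative
number such that

$$(2y)^{-1/2} \ge e^{-\delta y} \quad \text{for all } y \ge \alpha(\delta).$$

Then

$$\exp[-\beta T + \theta(T)] \ge (2\pi)^{-1/2} \int_{a^2/2}^{\infty} \exp[-(1 + \delta) y]\, dy = (2\pi)^{-1/2} (1 + \delta)^{-1} \exp[-\tfrac{1}{2}(1 + \delta) a^2]$$

for $a^2 \ge 2\alpha(\delta)$, from which it follows at once that

$$a^2 \ge 2(1 + \delta)^{-1} \beta T - 2(1 + \delta)^{-1} \{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\}$$

for $a^2 \ge 2\alpha(\delta)$. Since $\exp[-\beta T + \theta(T)] \to 0$ as $T \to \infty$, we see from
(8) that for each $\alpha(\delta) > 0$, there exists a constant $T_\delta > 0$ such that
$a^2 \ge 2\alpha(\delta)$ for all $T \ge T_\delta$. Thus, for each $\delta > 0$ there exists a $T_\delta \in (0, \infty)$
such that

$$c_{ij}^2 \ge 8(1 + \delta)^{-1} \gamma^{-1} \eta \beta T - 8(1 + \delta)^{-1} \gamma^{-1} \eta \{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\} \tag{9}$$

for all $T \ge T_\delta$.

Inequality (7) is therefore satisfied for all $T \ge T_0$ if $T_0 > T_\delta$ and

$$|E|^2 \le 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1} \gamma^{-1} \eta \beta T - 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1} \gamma^{-1} \eta \{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\} \tag{10}$$

for all $T \ge T_0$. By assumption: $|E|^2 \le \sigma T$ for all $T > 0$, in which

$$\sigma < 2(1 - \gamma^{1/2})^2 \gamma^{-1} \eta \beta.$$

Choose $\delta > 0$ so that

$$\sigma < 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1} \gamma^{-1} \eta \beta,$$

and then let $T_0 \in [T_\delta, \infty)$ be so large that

$$\sigma T \le 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1} \gamma^{-1} \eta \beta T - 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1} \gamma^{-1} \eta \{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\}$$

for all $T \ge T_0$. Then (10) is satisfied for all $T \ge T_0$. This completes
the proof of Sublemma 2.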


V. FINAL REMARKS


The writer is indebted to D. Hamming and L. A. Shepp for discus- 
sions concerning this work, and to J. Savage, D. Slepian, and A. Wyner 
for commenting on the draft. 
Phase Vocoder


By J. L. FLANAGAN and R. M. GOLDEN 
(Manuscript received July 18, 1966) 


A vocoder technique is described in which speech signals are represented 
by their short-time phase and amplitude spectra. A complete transmission 
system utilizing this approach is simulated on a digital computer. The en- 
coding method leads to an economy in transmission bandwidth and to a 
means for time compression and expansion of speech signals. 


I. INTRODUCTION 


Analysis-synthesis methods for speech transmission aim at efficient 
encoding of voice signals. A customary approach is to represent
separately the important features of vocal excitation and tract
transmission.¹ The well-known channel vocoder of Dudley² derives signals which
fall into this dichotomy. The tract transmission is described by values 
of the short-time amplitude spectrum measured at discrete frequencies, 
and the excitation is described in terms of the fundamental frequency 
of the voice and the voiced-unvoiced character of the signal. Efforts to 
solve the long-standing problem of good-quality synthesis from such 
representations have centered on adequate analysis and specification of 
the excitation data. 

One advance in surmounting the difficulties connected with pitch and 
voiced-unvoiced extraction is the voice-excited vocoder (VEV).® This 
device relies on transmission of an unprocessed subband of the original
speech to carry the excitation information. The spectral envelope infor- 
mation is transmitted as in the channel vocoder by a number of slowly- 
varying signals. Through accurate preservation of excitation details, a 
transmission of improved quality and modest bandsaving is achieved. 

The present paper proposes another technique for encoding speech to 
achieve comparable bandsaving and acceptable voice quality. In addi- 
tion, the technique provides a convenient means for compression and 
expansion of the time dimension. The method specifies the speech signal 
in terms of its short-time amplitude and phase spectra. For this reason, 
it is called phase vocoder. Like the VEV, the phase vocoder does not 
require the pitch tracking and voiced-unvoiced switching inherent in 
conventional channel vocoders. Elimination of these decision-making 
processes and the transmission of excitation information by phase- 
derivative signals contribute to improved quality in the synthesized 
signal. 


II. PRINCIPLES 


If a speech signal $f(t)$ is passed through a parallel bank of contiguous
band-pass filters and then recombined, the signal is not substantially
degraded. The operation is illustrated in Fig. 1, where $BP_1$ through $BP_N$
represent the contiguous filters. The filters are assumed to have
relatively flat amplitude and linear phase characteristics in their pass bands.
The output of the $n$th filter is $f_n(t)$, and the original signal is
approximated as

$$f(t) \approx \sum_{n=1}^{N} f_n(t). \tag{1}$$


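Let the impulse response of the $n$th filter be

$$g_n(t) = h(t) \cos \omega_n t, \tag{2}$$

where the envelope function $h(t)$ is normally the impulse response of a
physically-realizable low-pass filter. Then the output of the $n$th filter is
the convolution of $f(t)$ with $g_n(t)$,

$$f_n(t) = \int_{-\infty}^{t} f(\lambda)\, h(t - \lambda) \cos[\omega_n (t - \lambda)]\, d\lambda = \operatorname{Re}\left[ \exp(j\omega_n t) \int_{-\infty}^{t} f(\lambda)\, h(t - \lambda) \exp(-j\omega_n \lambda)\, d\lambda \right]. \tag{3}$$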


Fig. 1. Filtering of speech by contiguous band-pass filters.

The latter integral is a short-time Fourier transform of the input
signal $f(t)$, evaluated at radian frequency $\omega_n$. It is the Fourier transform
of that part of $f(t)$ which is "viewed" through the sliding time aperture
$h(t)$. If we denote the complex value of this transform as $F(\omega_n, t)$, its
magnitude is the short-time amplitude spectrum $|F(\omega_n, t)|$, and its
angle is the short-time phase spectrum $\varphi(\omega_n, t)$. Then


$$f_n(t) = \operatorname{Re}[\exp(j\omega_n t)\, F(\omega_n, t)]$$

or

$$f_n(t) = |F(\omega_n, t)| \cos[\omega_n t + \varphi(\omega_n, t)]. \tag{4}$$

Each $f_n(t)$ may, therefore, be described as the simultaneous amplitude
and phase modulation of a carrier ($\cos \omega_n t$) by the short-time amplitude
and phase spectra of $f(t)$, both evaluated at frequency $\omega_n$.
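The equivalence of the band-pass filter output (3) and the modulated-carrier form (4) can be checked numerically in discrete time. The sketch below is an illustration under stated assumptions: the exponential window `h`, the channel frequency, and the two-tone input are hypothetical stand-ins, not the filters or signals used in the paper.

```python
import cmath
import math

T = 1e-4                       # sampling interval in seconds (value used in the simulation)
omega_n = 2 * math.pi * 500.0  # hypothetical channel centre frequency (rad/s)

def h(k):
    """Hypothetical causal low-pass window (not the paper's Bessel filter)."""
    return math.exp(-k / 40.0) if k >= 0 else 0.0

# Hypothetical two-tone input signal.
f = [math.sin(2 * math.pi * 480 * l * T) + 0.5 * math.sin(2 * math.pi * 1300 * l * T)
     for l in range(300)]

def bandpass_output(m):
    """Direct convolution of f with g_n(t) = h(t) cos(omega_n t), as in (3)."""
    return sum(f[l] * h(m - l) * math.cos(omega_n * (m - l) * T) for l in range(m + 1))

def short_time_transform(m):
    """Short-time Fourier transform F(omega_n, mT) seen through the window h."""
    return sum(f[l] * h(m - l) * cmath.exp(-1j * omega_n * l * T) for l in range(m + 1))

m = 250
F = short_time_transform(m)
via_transform = (cmath.exp(1j * omega_n * m * T) * F).real
via_polar = abs(F) * math.cos(omega_n * m * T + cmath.phase(F))   # form (4)
```

All three quantities (`bandpass_output(m)`, `via_transform`, `via_polar`) agree to rounding error, which is the content of (3) and (4).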

Experience with channel vocoders shows that the magnitude functions
$|F(\omega_n, t)|$ may be band-limited to around 20 to 30 Hz without
substantial loss of perceptually-significant detail. The phase functions
$\varphi(\omega_n, t)$, however, are generally not bounded; hence they are unsuitable
as transmission parameters. Their time derivatives $\dot{\varphi}(\omega_n, t)$, on the
other hand, are better behaved, and we speculate that they may be
band-limited and used to advantage in transmission. To within an
additive constant, the phase functions can be recovered from the integrated
(accumulated) values of the derivatives. One practical approximation
to $f_n(t)$ is, therefore,

$$\tilde{f}_n(t) = |F(\omega_n, t)| \cos[\omega_n t + \tilde{\varphi}(\omega_n, t)], \tag{5}$$

where

$$\tilde{\varphi}(\omega_n, t) = \int_0^t \dot{\varphi}(\omega_n, t)\, dt.$$

The expectation is that loss of the additive phase constant will not be
unduly deleterious.
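A small numerical illustration of this point: an unbounded phase function can have a bounded derivative, and accumulating the derivative recovers the phase up to its initial value. The phase trajectory below is a hypothetical example, and the derivative is approximated by a first difference.

```python
import math

T = 1e-4   # sampling interval in seconds

# Hypothetical phase: constant offset + slow wobble + steady ramp (unbounded in time).
phi = [0.7 + 3.0 * math.sin(2 * math.pi * 5 * m * T) + 40.0 * m * T for m in range(2000)]

# First-difference approximation of the phase derivative (rad/s).
phi_dot = [(phi[m + 1] - phi[m]) / T for m in range(len(phi) - 1)]

# Accumulate the derivative, starting from zero as in the integral from 0 to t.
phi_tilde = [0.0]
for rate in phi_dot:
    phi_tilde.append(phi_tilde[-1] + rate * T)

# The reconstruction differs from the true phase only by the lost additive constant.
offsets = [p - q for p, q in zip(phi, phi_tilde)]
largest_rate = max(abs(rate) for rate in phi_dot)
```

Here every entry of `offsets` equals the initial phase 0.7, while the derivative stays within roughly 40 + 3·2π·5 rad/s however long the ramp runs.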

Reconstruction of the original signal is accomplished by summing the
outputs of $N$ oscillators modulated in phase and amplitude. The
oscillators are set to the nominal frequencies $\omega_n$, and they are simultaneously
phase and amplitude modulated from band-limited versions of $\dot{\varphi}(\omega_n, t)$
and $|F(\omega_n, t)|$. The synthesis operations are diagrammed in Fig. 2.

These analysis-synthesis operations may be viewed in an intuitively
appealing way. The conventional channel vocoder separates vocal
excitation and spectral envelope functions. The spectral envelope functions
of the conventional vocoder are the same as those described here by
$|F(\omega_n, t)|$. The excitation information, however, is contained in a
signal which specifies voice pitch and voiced-unvoiced (buzz-hiss)
excitation. In the phase vocoder, when the number of channels is reasonably
large, the information about excitation is conveyed primarily by the
$\dot{\varphi}(\omega_n, t)$ signals.* In the present technique, and if good quality and
natural transmission are requisites, the indications are that the $\dot{\varphi}(\omega_n, t)$
signals may require about the same channel capacity as the
spectrum-envelope information. This preliminary impression seems not
unreasonable in view of our experience with voice quality in vocoders.

Fig. 2. Speech synthesis based on the short-time amplitude and
phase-derivative spectra.


III. COMPUTER SIMULATION 


We have simulated a complete phase vocoder analyzer and synthesizer 
on an IBM 7094 computer. The program, written in the BLODI-B 
language,*® provides for the processing of any digitalized input speech 
signal. Flexibility built into the program permits examination of a num- 
ber of design parameters such as number of channels, width of analyzing 
pass bands, band center frequencies, and band limitation of the phase 
and amplitude signals. 

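In the analyzer, the amplitude and phase spectra are computed by
forming the real and imaginary parts of the complex spectrum

$$F(\omega_n, t) = a(\omega_n, t) - j\, b(\omega_n, t),$$

where

$$a(\omega_n, t) = \int_{-\infty}^{t} f(\lambda)\, h(t - \lambda) \cos \omega_n \lambda\, d\lambda$$

and

$$b(\omega_n, t) = \int_{-\infty}^{t} f(\lambda)\, h(t - \lambda) \sin \omega_n \lambda\, d\lambda. \tag{6}$$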


* At the other extreme, with a small number of broad analyzing channels, the
amplitude signals contain more information about the excitation, while the $\dot{\varphi}$
phase signals tend to contain more information about the spectral shape. Qualitatively,
therefore, the number of channels determines the relative amounts of
excitation and spectral information carried by the amplitude and phase signals.
Then,

$$|F(\omega_n, t)| = (a^2 + b^2)^{1/2}$$

and

$$\dot{\varphi}(\omega_n, t) = \frac{b\,\dot{a} - a\,\dot{b}}{a^2 + b^2}. \tag{7}$$


The computer, of course, must deal with sampled-data equivalents of
these quantities. Transforming the real and imaginary parts of (6) into
discrete form for programming yields

$$a(\omega_n, mT) = T \sum_{l=0}^{m} f(lT)\, [\cos \omega_n lT]\, h(mT - lT)$$

$$b(\omega_n, mT) = T \sum_{l=0}^{m} f(lT)\, [\sin \omega_n lT]\, h(mT - lT), \tag{8}$$

where $T$ is the sampling interval. In the present simulation, $T = 10^{-4}$
sec. From these equations, the difference values are computed as

$$\Delta a = a[\omega_n, (m + 1)T] - a[\omega_n, mT]$$

and

$$\Delta b = b[\omega_n, (m + 1)T] - b[\omega_n, mT]. \tag{9}$$


The magnitude function and phase derivative, in discrete form, are
computed from (8) and (9) as

$$|F[\omega_n, mT]| = (a^2 + b^2)^{1/2}$$

$$\frac{\Delta \varphi}{T}[\omega_n, mT] = \frac{1}{T} \frac{(b\, \Delta a - a\, \Delta b)}{a^2 + b^2}. \tag{10}$$
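Equations (8) through (10) can be sketched for a single channel as follows. This is an illustration under assumptions, not the BLODI-B program: the Hann-shaped window, its length, the channel frequency, and the pure-tone input are all hypothetical. For a tone slightly above the channel centre, the phase derivative should come out close to the frequency offset between tone and centre.

```python
import math

T = 1e-4                        # sampling interval in seconds, as in the simulation
omega_n = 2 * math.pi * 500.0   # hypothetical channel centre frequency (rad/s)
omega_0 = 2 * math.pi * 510.0   # hypothetical test-tone frequency (rad/s)
L = 200                         # hypothetical window length in samples

def h(k):
    """Hypothetical Hann-shaped low-pass window; the paper uses a Bessel filter."""
    return 0.5 * (1.0 - math.cos(2 * math.pi * k / L)) if 0 <= k < L else 0.0

def f(l):
    """Hypothetical input: a pure tone at omega_0."""
    return math.cos(omega_0 * l * T)

def a_b(m):
    """Equation (8): the sums a and b at time mT (terms outside the window are zero)."""
    lo = max(0, m - L + 1)
    a = T * sum(f(l) * math.cos(omega_n * l * T) * h(m - l) for l in range(lo, m + 1))
    b = T * sum(f(l) * math.sin(omega_n * l * T) * h(m - l) for l in range(lo, m + 1))
    return a, b

def analyze(m):
    """Equations (9) and (10): magnitude and phase derivative at time mT."""
    a, b = a_b(m)
    a_next, b_next = a_b(m + 1)
    delta_a, delta_b = a_next - a, b_next - b
    magnitude = math.sqrt(a * a + b * b)
    phase_rate = (b * delta_a - a * delta_b) / (T * (a * a + b * b))
    return magnitude, phase_rate

magnitude, phase_rate = analyze(600)
offset = omega_0 - omega_n      # about 62.8 rad/s
```

The small residual difference between `phase_rate` and `offset` comes from the image component at the sum frequency, which the low-pass window attenuates but does not remove entirely.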

Fig. 3 shows a block diagram of a single analyzer channel as realized
in BLODI-B. Since this block of coding is required for each channel, it is
defined as a new block type and thereafter used as though it were a
single block. A parameter associated with the block determines the
center frequency for each channel. The time-window analyzing filter,
labeled $h(lT)$, is itself a special block and can be changed simply by
the substitution of a different block of coding.®

In the present simulation, a sixth-order Bessel filter is used for the
$h(lT)$ window. Its amplitude, phase, and delay responses are plotted
in Figs. 4(a), (b), and (c), respectively. Its impulse and step responses
are given in Figs. 4(d) and (e). The present simulation uses 30 channels
($N = 30$) and $\omega_n = 2\pi n(100)$ rad/sec. The equivalent pass bands of the
analyzing filters overlap at their 6 dB down points and a total spectrum
range of 50 to 3050 Hz is analyzed.

Fig. 3. Programmed operations for extracting $|F(\omega_n, t)|$ and $\dot{\varphi}(\omega_n, t)$.

Programmed low-pass filtering of any desired form may be applied to
the amplitude and phase difference signals as defined by Fig. 3.
Simulation of the whole system is completed by the synthesis operations for
each channel performed according to

$$\tilde{f}_n(mT) = |F(\omega_n, mT)| \cos\left( \omega_n mT + T \sum_{l=0}^{m} \frac{\Delta \varphi(\omega_n, lT)}{T} \right). \tag{11}$$

Adding the outputs of the $N$ individual channels, according to (1),
produces the synthesized speech signal.
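The synthesis rule (11) for one channel amounts to a running sum of the phase-difference signal feeding a cosine oscillator. The sketch below assumes hypothetical control signals (constant magnitude and a constant phase derivative), for which the output should be a tone shifted from the channel centre by that derivative.

```python
import math

T = 1e-4                        # sampling interval in seconds
omega_n = 2 * math.pi * 500.0   # hypothetical channel centre frequency (rad/s)
deviation = 2 * math.pi * 10.0  # hypothetical constant phase derivative (rad/s)
M = 1000                        # number of samples to synthesize

magnitude = [1.0] * M           # hypothetical |F(omega_n, mT)|
phase_rate = [deviation] * M    # hypothetical (1/T) * delta-phi(omega_n, mT)

def synthesize_channel(magnitude, phase_rate, omega_n, T):
    """Equation (11): oscillator at omega_n, phase-modulated by the accumulated phase rate."""
    output = []
    accumulated = 0.0
    for m, (mag, rate) in enumerate(zip(magnitude, phase_rate)):
        accumulated += T * rate     # T * sum over l = 0..m of (delta-phi / T)
        output.append(mag * math.cos(omega_n * m * T + accumulated))
    return output

channel = synthesize_channel(magnitude, phase_rate, omega_n, T)

def expected(m):
    """Tone at omega_n + deviation with the phase offset produced by the running sum."""
    return math.cos((omega_n + deviation) * m * T + T * deviation)
```

Summing such channel outputs over all channels, as in (1), would give the synthesized signal.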


IV. TYPICAL RESULTS 


As part of the present simulation, identical (programmed) low-pass
filters were applied to the $|F(\omega_n, lT)|$ and $(1/T)\Delta\varphi(\omega_n, lT)$ signals
delivered by the coding block shown in Fig. 3. These low-pass filters are
similar to the $h(lT)$ filters except they are fourth-order Bessel designs.
Their response characteristics are shown in Fig. 5. The cut-off frequency
is 25 Hz, and the response is 7.6 dB down at this frequency. This
filtering is applied to the amplitude and phase signals of all 30 channels
in the present simulation. The total bandwidth occupancy of the system
is therefore 1500 Hz, or a band reduction of 2:1.
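The bandwidth figure follows from simple arithmetic on the numbers quoted in the text, shown here as a short sketch (the variable names are ours).

```python
channels = 30                 # number of analyzing channels
signals_per_channel = 2       # one amplitude and one phase-derivative signal
cutoff_hz = 25.0              # low-pass cut-off applied to each control signal

total_occupancy_hz = channels * signals_per_channel * cutoff_hz
analyzed_band_hz = 3050.0 - 50.0          # analyzed spectrum range
reduction = analyzed_band_hz / total_occupancy_hz
```

This gives 1500 Hz of control-signal bandwidth against 3000 Hz of analyzed speech band, the 2:1 reduction stated above.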


Fig. 4. $h(t)$ analyzing function and its spectral transform used in one
simulation of the phase vocoder. The function is a sixth-order Bessel filter
having a −6 dB cut-off of 50 Hz. (Plot title: Bessel design, 6th order,
−6 dB at 50.0 Hz.)

(Fig. 5 plot title: Bessel design, 4th order, −7.6 dB at 25.0 Hz.)

After band-limitation, the phase and amplitude signals are used to
synthesize an output according to (11). The result of processing a
complete sentence through the programmed system is shown by the sound
spectrograms in Fig. 6.* Since the signal band covered by the analysis
and synthesis is 50 to 3050 Hz, the phase-vocoded result is seen to cut off at
3050 Hz. In this example, the system is connected in a "back-to-back"
configuration, and the band-limited channel signals are not multiplexed.

Comparison of original and synthesized spectrograms reveals that 
formant details are well preserved and pitch and voiced-unvoiced fea- 
tures are retained to perceptually significant accuracy. The quality of 
the resulting signal considerably surpasses that usually associated with 
conventional channel vocoders. 


V. MULTIPLEXING FOR TRANSMISSION 


Besides conventional multiplexing methods for transmitting the
band-limited phase and amplitude channel signals (that is, space-frequency or
time-division multiplex), the coding technique suggests several other
possibilities for transmission in a practicable communication system.
As an example, suppose a limited-bandwidth analog channel is the
available communication link. One advantageous procedure then is
simply to divide (or scale down) all of the phase-derivative signals by
some number, say 2 if the available channel has only one-half the
conventional voice bandwidth. A synthetic signal of one-half the original
bandwidth is then produced by modulating carriers of $\omega_n/2$ by the
$\dot{\varphi}_n/2$ and $|F_n|$ signals. The synthetic analog signal now may be
transmitted over the half-bandwidth channel.

At the receiver, restoration to the original bandwidth is accomplished
by a second sequence of analysis and synthesis operations; namely,
amplitude and phase analysis of the half-band signal, multiplication of
the phase-derivative signals by a factor of 2, and modulation of $\omega_n$
carriers by the restored $\dot{\varphi}_n$ and reanalyzed $|F_n|$ signals. This
"self-multiplexing" transmission is illustrated in Fig. 7. Spectrograms of the
input signal, the half-band frequency divided signal, and the reanalyzed
and resynthesized output are shown. It is clear that two trips through
the process introduce measurable degradation, but the intelligibility
and quality, particularly for high-pitched voices, remain reasonably
good.

* The input speech signal is band limited to 4000 Hz. It is sampled at 10,000 Hz
and quantized to 12 bits. It is called into the program from a digital recording
prepared previously.

Fig. 6. Spectrograms illustrating speech transmitted by the phase vocoder ($N = 30$). The band-pass
analysis is by sixth-order Bessel filters of 100-Hz bandwidth. Low-pass filtering of $|F_n|$ and $\dot{\varphi}_n$ is by fourth-order Bessel
filters with 25 Hz cut-off. Male speaker A. "Should we chase those young outlaw cowboys."

Fig. 7. Self-multiplexing transmission for a bandwidth reduction of 2:1. (a) Original input; (b) Frequency-divided
synthetic signal for analog transmission over one-half bandwidth channel; (c) Synthesized output from the reanalyzed,
frequency-multiplied, half-band signal. Male speaker B.

In effect, the greatest number $q$ by which the $\omega_n$ and $\dot{\varphi}_n$'s may be
divided is determined by how distinct the side-bands about each $\omega_n/q$
remain, and by how well each $\dot{\varphi}_n/q$ and $|F_n|$ may be retrieved from
them.* Practically, the greatest number appears to be about 2 or 3 if
transmission of acceptable quality is to be realized.


VI. COMPRESSION AND EXPANSION OF THE TIME SCALE 


As mentioned above, a synthetic frequency-divided signal may be
produced through division of $[\omega_n t + \int \dot{\varphi}_n\, dt]$ by some number $q$. This
signal may be essentially restored to its original spectral position by a
time speed-up of $q$. Such a speed-up can be accomplished by recording
at one speed and replaying $q$ times faster. The result is that the time
scale is compressed and the message, although spectrally correct, lasts
$1/q$ as long as the original. An example of a 2:1 frequency division
and time speed-up is shown by the sound spectrograms in Fig. 8. This
feature of the phase vocoder is completely parallel to the
time-compression feature of the "harmonic compressor" reported earlier.’
However, the techniques for analysis and synthesis in the two cases are
basically different, and the phase vocoder allows compression by
non-integer factors.
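The frequency-division and speed-up argument can be traced on a single channel. In the sketch below the channel frequency, the constant phase derivative, and the non-integer factor q = 1.5 are hypothetical; the "replay" is modeled simply by reinterpreting the sample spacing as T/q.

```python
import math

T = 1e-4                        # recording sample interval in seconds
omega_n = 2 * math.pi * 500.0   # hypothetical channel centre frequency (rad/s)
deviation = 2 * math.pi * 10.0  # hypothetical constant phase derivative (rad/s)
q = 1.5                         # hypothetical non-integer division factor
M = 2000                        # number of samples

def frequency_divided(m):
    """Sample m of cos([omega_n * t + integral of phase rate] / q) at t = m * T."""
    phase = omega_n * m * T + deviation * m * T
    return math.cos(phase / q)

divided = [frequency_divided(m) for m in range(M)]

# Replaying q times faster: the same samples are spaced T / q apart.
replay_interval = T / q
original_duration = M * T
replayed_duration = M * replay_interval

def restored_tone(t):
    """Tone at the original frequency omega_n + deviation."""
    return math.cos((omega_n + deviation) * t)
```

Read at the faster rate, the divided samples coincide with a tone at the original frequency, while the message lasts 1/q as long.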

Time-scale expansion is likewise possible by the frequency
multiplication $q[\omega_n t + \int \dot{\varphi}_n\, dt]$; that is, by recording the frequency-multiplied
synthetic signal and then replaying it at a speed $q$ times slower. An
example of time-expanded speech is shown by the spectrograms in Fig.
9. The expansion feature provides an interesting "auditory microscope"
for directing attention to the spectral properties of specific elements of
speech sounds, such as rapidly articulated consonants. In both
compression and expansion of the time scale, a perceptual limit exists, of
course, to how greatly the time scale may be altered and still have the
signal sound like human speech.

An attractive feature of the phase vocoder is that the operations for
expansion and compression of the time and frequency scales can be
realized by simple scaling of the phase-derivative spectrum. Since the
frequency division and multiplication factors can be non-integers, and
can be varied with time, the phase vocoder provides an attractive tool
for studying non-uniform alterations of the time scale.”


* More precisely, the maximum divisor is determined by how closely 
1/q So’ endt 
represents 


So ondt. 
Fig. 9. Time expansion of speech by a factor of 2. Female speaker. "High altitude jets whiz past
screaming." (a) Original input; (b) Time-expanded output.


VII. FURTHER REMARKS ABOUT BAND OCCUPANCY


The possibilities of frequency division imply that the $|F_n|$ and $\dot{\varphi}_n$
signals are, in practical effect, band-limited. As described previously,
modest bandwidth reduction of the order of 2:1 can be accomplished by
a simple scaling of all the $\dot{\varphi}_n$ signals by ½. (Overt low-pass filtering of
the $\dot{\varphi}_n$ signals is not required.) Also, low-pass filtering the analyzed
signals to a total band occupancy of one-half the original bandwidth
results in relatively good speech quality upon synthesis (Fig. 6). If,
however, some further trade between band saving and speech quality is
desired, the control signals may be low-passed more severely, with
concomitant loss in quality. The impairment resulting from low-passing
the $\dot{\varphi}_n$ signals is a comb-filtering, reverberant effect in the reconstituted
signals. Qualitatively, low-pass filtering of the $\dot{\varphi}_n$ signals apparently
restricts the rate at which pitch changes can be duplicated, and
"narrows" the sidebands produced about each $\omega_n$-carrier at the synthesizer.
The discussion connected with (4) has pointed out that each
band-pass signal in the phase vocoder may be considered as the simultaneous
amplitude and phase modulation:

$$f_n = |F_n| \cos(\omega_n t + \varphi_n),$$

where $|F_n|$ and $\varphi_n$ are non-band limited, real-valued functions of $\omega_n$
and time. Practically, the bandwidth of $f_n(t)$ is confined to $2W$, where
$W$ is the cut-off frequency of the low-pass time aperture $h(t)$. This fact
does not, however, suggest in an explicit way the band occupancy of
the signals $|F_n|$ and $\dot{\varphi}_n$. The experimental results of the present study
indicate that each of the latter, at least for practical purposes, can be
limited to around $W/2$ or less, but analytical treatment leading to
explanation is difficult. Even the inverse problem, that is, calculation of
the band occupancy of a simultaneously amplitude and phase modulated
carrier, can only be bounded loosely.’ To apply these bounds requires a
precise description of the $|F_n|$ and $\dot{\varphi}_n$ signals. Although these
parameters can be measured for a given speech signal, a general mathematical
specification is not presently available. It is easy to indicate the
difficulties involved. Consider the usual model of voiced speech sounds;
that is, a periodic pulse source, whose frequency (pitch) may change
with time, supplying excitation to a linear, passive, time-variable
network. Variation of the network transmission represents the spectral
changes both in the vocal sound source and the vocal tract transmission.
For an analysis in terms of narrow pass-bands (large $N$), the $\dot{\varphi}_n$ signals
depend primarily upon voice pitch. The $|F_n|$ signals, on the other hand,
depend both upon source spectrum and vocal transmission at any given
instant.


VIII. CONSIDERATIONS FOR DIGITAL TRANSMISSION 


Applications of the phase vocoder technique to digital transmission
are of course obvious. Given an acceptable band-limitation of the $|F_n|$
and $\dot{\varphi}_n$ signals, each may be sampled at its Nyquist rate, or higher, and
quantized to an accuracy that is perceptually sufficient. At this writing,
optimum parameters for sampling and quantizing the control signals
have not been studied in detail. Based upon past experience, however,
a nonuniform distribution of the pass bandwidths of the analyzing
filters would appear advantageous. For example, center frequencies and
bandwidths chosen according to the Koenig scale, the mel (pitch) scale,
or the auditory critical-band function should yield dividends.*

All of these bandwidth tapers are characterized by widths which 
monotonically increase with frequency. In such cases, the low-pass 
filtering applied to the amplitude signals would have cut-off frequencies 
also increasing monotonically with frequency. On the other hand, the 
low-pass filters applied to the phase signals might have cut-offs which
decrease with frequency. As a result, sampling rates would increase with
$\omega_n$ for amplitude signals and diminish for phase signals. In addition,
quantization levels for all signals might be made more coarse (less nu- 
merous) with increasing channel frequency. This is indicated because 
the ability of the ear to perceive frequency and amplitude changes in 
the higher end of a complex spectrum is, in general, less acute than for 
the lower part. 

Although detailed study is yet to be made of optimum digital
formats, experience in this area with related vocoder devices suggests that
transmission at bit rates somewhat less than ten kilobits/sec should be
possible without impairment due to digitalization. This rate is several
times less than that normally associated with comparable quality PCM
encodings of the speech waveform. Besides the questions of design
optimization and data format for digital transmission, the trade which may
be effected between signal quality and total bit rate is also a subject for
further investigation.


IX. CONCLUDING COMMENTS 


Because the phase vocoder produces phase derivative signals, it provides
a particularly convenient means for multiplying or dividing the
frequency spectrum of a broadband signal. By the same token, it is a
convenient method for compressing or expanding the time scale of a
signal. Frequency division of speech appears to hold potential as a com-
munication aid for persons with hearing deficient in the high frequencies.
Time compression shows promise for auditory "speed-reading" by
persons with impaired sight.


* Preliminary tests along these lines indicate that a phase vocoder with as
few as eight non-uniform channels is capable of relatively good transmission
(J. J. Kalsalik, unpublished work).

Psychoacoustic and physiological studies show that the human ear 
makes a type of short-time spectral analysis of acoustic signals. This 
analysis occurs at an early level in the auditory processing; in fact, at a 
preneural level. It is also clear that the auditory system utilizes informa- 
tion corresponding to smoothed values of the short-time amplitude and 
phase spectra. The phase vocoder aims to turn these facts to advantage 
by describing speech signals in terms of band-limited values of the short- 
time amplitude and phase-derivative spectra. Indications are that band- 
limited spectral samples, occupying a bandwidth on the order of one 
half that of the original signal, preserve perceptually-significant features 
of the signal. Further band conservation can be realized, but at the 
expense of signal quality. As in many other transmission systems, a 
continuum of band conservation (or bit rate) versus signal quality exists, 
and one may choose the point of operation to suit requirements. 


Theory of Error Rates for Digital FM


By J. E. MAZO and J. SALZ 
(Manuscript received June 29, 1966) 


A general theory is presented for evaluating the error performance of a 
digital FM system in the presence of additive noise. The digital system con- 
sidered is a conventional one employing a voltage-controlled oscillator as the 
modulator and a limiter-discriminator followed by a low-pass filter as the 
demodulator. Because of the nonlinear nature of the demodulation process, 
no adequate analytical techniques have been available to provide a satisfac- 
tory treatment. Adopting the notion of ‘‘clicks’’ used by S. O. Rice to study 
threshold effects in analog FM systems, we have succeeded in evolving a 
theory capable of predicting performance for a wide range of applications. 
While our theory reinforces some previously derived results for binary and 
for narrow-band systems, the results obtained here are not confined to these 
situations. In particular, the inefficiency of the FM discriminator as a de- 
tector for a large number of orthogonal signals is quantitatively evaluated, as 
well as the role of the post-detection filter. Some qualitative aspects of the 
error-causing mechanisms discussed in the paper are general, but quantita- 
tive results are confined to additive Gaussian noise and large signal-to- 
noise ratios. 


I. INTRODUCTION


Theoretical investigations of FM receivers with analog input signals
date back to J. R. Carson and T. C. Fry,¹ and to M. G. Crosby.² These
investigators and others that followed them³⁻⁵ were primarily concerned
with the signal-to-noise (S/N) transfer attainable in FM receivers and
the determination of threshold effects. Recently S. O. Rice,⁶ and previ-
ously J. Cohn,⁷ attacked the threshold problem in FM receivers from a
fresh point of view by using the notion of "clicks." It has been observed
that when the noise at the input of an FM receiver is increased beyond
some value, the receiver "breaks," that is, for a given (S/N) at the in-
put, a much poorer (S/N) at the output is measured than would be pre-
dicted from a linearized analysis of the receiver. Before the breaking
point, clicks are heard in the output of an audio receiver. As the input
noise is further increased, the clicks merge into a sputtering sound.
Rice's approach is to relate this breaking point with the expected num-
ber of clicks per second at the output due to the added noise at the in-
put.

While in analog application the criterion of (S/N) transfer is satisfac- 
tory, in digital data transmission it does not by itself furnish an adequate 
performance criterion. Usually performance is judged in terms of error 
rates which cannot be predicted from the (S/N) transfer for nonlinear 
receivers. The error rate clearly depends on the statistical distribution of 
the output noise. In good systems, the errors are very infrequent and are 
associated with rare peak noise conditions. The statistical structure of 
the occurrence of infrequent noise peaks and the manner in which they 
cause errors in FM receivers is the main subject of this paper. Some 
previous investigations of these effects have been carried out. For ex-
ample, Bennett and Salz⁸ have analyzed binary FM systems, including
the effects of distortion. They derived formulas for the error rate without
including the post-detection filter in their model. Since the error rates
that they obtained for a well-designed binary system were close to the
optimum obtainable for any receiver, they were able to conclude that the
neglect of this filter was justified. Formulas are also available⁹,¹⁰ for the
probability distribution function of the instantaneous frequency of sig-
nal plus noise at the input to the post-detection filter for N-ary FM, but
these equations are not very useful in predicting the performance of a
practical FM system since the task of relating this distribution to the
distribution at the output of the post-detection filter is apparently
untractable. In a recent paper, Salz¹¹ considered a multilevel FM narrow-
band digital communications system where he included the post-detec-
tion filter in his analysis. However, the results assume that the post-de-
tection filter did not perform significant selective processing of the
detected signal.

In this paper, we shall develop a general theory from which the per-
formance of FM receivers with arbitrary processing gain may be pre-
dicted. We shall view the conventional FM receiver, described in Section
II, as a device for detecting digital signals and examine its properties in 
detail. In Section III, after approximating the post-detection filter by an 
ideal integrator, we show how clicks enter the problem.* Our assumptions 
and the ensuing mathematical model of the stochastic output are also 
stated there. The following section supplies the considerable amount of
mathematical detail needed to quantitatively substantiate the work of
Sections V through VII. In particular, the notion of clicks will be used to
explain the poor performance (compared to ideal) of this receiver to de-
tect a large number of orthogonal signals. This phenomenon has also been
mentioned by Wozencraft and Jacobs.¹² Another result of the present
paper is to establish conditions under which the previous analyses reli-
ably predict the performance of actual FM systems. The work of Refs. 8
and 11 will be supported and it will be shown that for multilevel wide-
band systems the post-detection filter cannot be ignored. Finally, in Sec-
tion VIII a discussion is given to suggest circumstances under which suc-
cessive clicks will not be independent and an instructive example is
given showing how this renders ineffective the additional selective filter-
ing possible at the input when the frequencies are very widely spaced.


* Cohn, Ref. 7, has also mentioned the application of the concept of clicks to
explain errors in digital FM. Further, D. Schilling of Brooklyn Polytechnic Insti-
tute has called to the authors' attention that he is also investigating the relation-
ship between clicks and error rates in FM.


II. THE DIGITAL FM SYSTEM


A digital FM signal is readily produced by changing the frequency of an 
oscillator in response to a digital baseband signal. The voltage or current 
at the output of such an oscillator may be represented as 


$$S(t) = A \cos\left[\omega_c t + \int^{t} s(t')\,dt' + \alpha\right], \tag{1}$$


where A is a real amplitude, ω_c the angular center frequency of the oscil-
lator, and α is an initial phase angle. The digital information-bearing
signal s(t) is taken to be a piece-wise constant function of time represent-
able as a random time series of the form


$$s(t) = \omega_d \sum_{n} a_n\, g(t - nT), \tag{2}$$


where {a_n, n = 0, 1, ...} is a sequence of independent and identically
distributed integer valued stochastic variables representing the data.
For example, one might have a_n = ±1 with equal probability for binary
systems. The function g(t) is a rectangular pulse of unit amplitude and T
seconds duration and ω_d is a proportionality constant relating frequency
displacement to baseband signal voltage or current. The spectral proper-
ties of this FM wave have been extensively analyzed in Refs. 13 and 14.
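
The construction in (1) and (2) can be sketched numerically. The following is a minimal illustration, not the authors' procedure: the sample rate `fs`, the numerical values, and the function name `digital_fm` are assumptions introduced here, and the phase integral is taken to start at t = 0.

```python
import math
import random

def digital_fm(symbols, omega_c, omega_d, T, fs, A=1.0, alpha=0.0):
    """Sample A*cos(omega_c*t + integral of s + alpha) at rate fs (assumed)."""
    dt = 1.0 / fs
    n_per_symbol = int(round(T * fs))
    phase = alpha  # running value of the integral of s(t') plus alpha
    out = []
    for k, a in enumerate(symbols):
        for i in range(n_per_symbol):
            t = (k * n_per_symbol + i) * dt
            out.append(A * math.cos(omega_c * t + phase))
            # s(t) = omega_d * a_n while symbol n is being sent
            phase += omega_d * a * dt
    return out

random.seed(0)
data = [random.choice((-1, 1)) for _ in range(8)]  # binary data, a_n = +/-1
wave = digital_fm(data, omega_c=2 * math.pi * 10.0,
                  omega_d=2 * math.pi * 1.0, T=1.0, fs=1000.0)
```

Because the phase is accumulated rather than recomputed for each symbol, the wave stays phase-continuous across symbol boundaries, as a frequency-shifted oscillator would.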

Transmission and reception of the FM wave is accomplished as follows. 
The wave S(t) is first processed by a transmitting filter, channel noise is
added, and the result is processed again by a receiving filter assumed to
be the inverse of the transmitting one. The signal is then detected via
the limiter-discriminator and filtered at baseband before being synchro-
nously sampled at t = nT (using independent timing information) to de-
termine sequentially the values of {a_n}. We have illustrated these opera-
tions in block diagram form in Fig. 1. A detailed description of the blocks
shown is given in Ref. 15. We shall state here in mathematical terms the
assumed operation of the limiter-discriminator. Let the input to the
limiter be written in terms of in-phase and quadrature components as


$$x'(t)\cos\omega_c t - y'(t)\sin\omega_c t = R(t)\cos[\omega_c t + \varphi(t)], \tag{3}$$

where

$$R(t) = \sqrt{[x'(t)]^2 + [y'(t)]^2} \tag{4}$$

and

$$\varphi(t) = \tan^{-1}\frac{y'(t)}{x'(t)}. \tag{5}$$

Then the output of the discriminator is taken to be

$$\frac{d\varphi}{dt} = \frac{x'(t)\,\dot{y}'(t) - y'(t)\,\dot{x}'(t)}{[x'(t)]^2 + [y'(t)]^2}, \tag{6}$$


where the dots denote differentiation with respect to time. The post-
detection filter acts upon the quantity (6).


Fig. 1. Block diagram of a digital FM receiver.
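
As a rough numerical check of the discriminator law (6), the sketch below evaluates (x ẏ − y ẋ)/(x² + y²) by central differences for a noise-free tone offset from the carrier by ω_d. The step size, the offset, and the function name `discriminator` are illustrative assumptions only.

```python
import math

def discriminator(x, y, dt):
    """Approximate (x*dy/dt - y*dx/dt) / (x^2 + y^2) by central differences."""
    out = []
    for k in range(1, len(x) - 1):
        xdot = (x[k + 1] - x[k - 1]) / (2 * dt)
        ydot = (y[k + 1] - y[k - 1]) / (2 * dt)
        out.append((x[k] * ydot - y[k] * xdot) / (x[k] ** 2 + y[k] ** 2))
    return out

omega_d = 2 * math.pi * 3.0          # hypothetical frequency offset, rad/sec
dt = 1e-4
ts = [k * dt for k in range(2000)]
x = [math.cos(omega_d * t) for t in ts]   # in-phase component
y = [math.sin(omega_d * t) for t in ts]   # quadrature component
freq = discriminator(x, y, dt)
```

For this input every element of `freq` is close to ω_d, which is the sense in which the discriminator output measures instantaneous frequency.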


III. FORMULATION OF THE PROBLEM AND A MATHEMATICAL MODEL 


We approximate the low-pass filter as an ideal integrator whose im-
pulse response is unity for T seconds and zero afterward. The duration
T is taken equal to the signaling time, and so no intersymbol interfer-
ence occurs at the sampling times for a wave described by (1) and (2).
The results obtained with this particular choice of filter should be repre- 
sentative of the results one would obtain with any low-pass filter of simi- 
lar bandwidth. 

The sampled output q' of the discriminator low-pass filter is
given by


$$q' = \int_0^T \frac{x'(t)\,\dot{y}'(t) - y'(t)\,\dot{x}'(t)}{[x'(t)]^2 + [y'(t)]^2}\,dt. \tag{7}$$

The in-phase and quadrature components occurring in (7) are now not
those of the pure FM wave (1), but have the analogous components of
zero mean noise added in as well. One may, by use of a rotating coordinate
system, equally consider


$$q = q' - a_n\omega_d T = \int_0^T \frac{x(t)\,\dot{y}(t) - y(t)\,\dot{x}(t)}{x^2(t) + y^2(t)}\,dt, \tag{8}$$


where y(t) is now a zero mean quadrature noise process, while x(t) is an
in-phase noise process with mean A, the amplitude of the noise-free re-
ceived FM wave. We now proceed formally with (8), defining a quantity


$$r(t) = y(t)/x(t). \tag{9}$$

Equation (8) is then rewritten as a path integral

$$q = \int_{r(0)}^{r(T)} \frac{dr(t)}{1 + r^2(t)} = \int d\varphi(t). \tag{10}$$


In (10) we have written dφ = d(tan⁻¹ y/x), but of course we do not mean
that φ is evaluated using some fixed branch of tan⁻¹ y/x since this would
give φ as a single valued function of y and x and would not allow for the
fact that as we circle once about the origin in the xy-plane φ increases by
2π. The noise processes y(t) and x(t) wander about the xy-plane (see
Fig. 2), usually staying close to their mean values but occasionally tak-
ing large excursions and encircling the origin. Each infinitesimal portion
of the path contributes an amount dq volts to the output and all these
small amounts from all the small portions of the path must be added to-
gether to form the total contribution q. It is easy to see that q depends on
the path taken, not just on its endpoints. A simple mathematical reason
for this is that the transformation (9) is undefined whenever x(t) = 0.
Further, the paths taken in the xy-plane are random, and q is therefore
a random variable with some probability density related to the statistics
of r(t). Unfortunately, this probability density is not determined solely
by the elementary statistics of r(t). As will be seen, in addition to the
elementary statistics of r(t) the distribution of its singularities on the time
axis enters the picture. The singularities of r(t) are determined by the
zero-crossings of x(t). Thus, the behavior of FM receivers is intimately
related to the structure of the zero crossings of the added noise.¹⁶


Fig. 2. A possible path in the xy-plane traced by the noise from t = 0 to t = T.

To see how to handle the situation, visualize the following hypotheti-
cal state of affairs. Suppose for 0 ≤ t ≤ T we have that y(t) > 0, and
that x(t) is positive for a while, decreases once through zero at t = t₀,
and then remains negative. A possible plot of r(t) versus t over the time
interval is then shown in Fig. 3. For this particular path one has

$$q = \int_{r(0)}^{\infty} \frac{dr}{1 + r^2} + \int_{-\infty}^{r(T)} \frac{dr}{1 + r^2} = \int_{-\infty}^{\infty} \frac{dr}{1 + r^2} + \int_{r(0)}^{r(T)} \frac{dr}{1 + r^2}. \tag{11}$$

In (11) the straightforward interpretation of the integrals is meant.
Evaluating the infinite integral one obtains for this path


$$q = \pi + \tan^{-1} r(T) - \tan^{-1} r(0),$$


where tan⁻¹ x means the principal value inverse tangent function,
|tan⁻¹ x| ≤ π/2. In general, one has the result that


$$q = \tan^{-1} r(T) - \tan^{-1} r(0) + n(T)\pi, \tag{12}$$


where tan⁻¹ x again has the principal value interpretation and n(T) is
an integer (which may be positive, negative, or zero) which is related to
the number of times x(t) vanishes in the interval T and to the sign of y(t)
when x(t) vanishes. For large signal-to-noise ratios it is clear that if
x(t) vanishes by going to zero from the positive side that it will almost
immediately be followed by another vanishing of x(t) in the other direc-
tion. If y(t) has not changed, the contribution of the "return trip" to
n(T) will cancel the contribution from the previous crossing of the y-axis.
On the other hand, if y(t) does change sign so as to cause an encircling
of the origin then the contribution to n(T) will be the same as the previous
crossing. The net contribution to n(T) of a number of paths is shown in
Fig. 4. The paths which have Δn = ±2 are immediately recognized as
the "clicks" discussed by Rice.⁶ The "clicks" are not the only contribu-
tion to n(T), however. There is also a contribution because of the fact
that at t = 0, when our process begins, we may be in the middle of a
large noise fluctuation and be over in the left-half plane. Immediately
afterwards, at t = 0+, we will experience a contribution of ±1 to n(T);
a similar situation may prevail at time t = T, when a possibility exists of
stopping the process immediately after we have crossed over to the left-
half plane. We will show later that for large signal-to-noise ratios, these
end-effects may be neglected because they occur with a probability that
is asymptotically negligible compared with the probability of a click.


Fig. 3. A possible sample function of r(t).

An important fact to observe before proceeding with the analysis is 
that q can be decomposed into the sum of three random variables. The 
first two random variables appearing in (12) are continuous and bounded. 
Their probability densities are related to the elementary statistics of 
x(t) and y(t). The third random variable is a discrete one, whose proba-
bilities are determined from the probabilities of zero-crossings of x(t)
and y(t).
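
The decomposition in (12) can be illustrated on a sampled path. The sketch below accumulates the phase increment along a path in the xy-plane and then recovers the integer n(T) as the difference between the accumulated phase and the principal-value term, in units of π. The two test paths and the function names are hypothetical; the path is assumed to be sampled finely enough that each step turns by less than π.

```python
import math

def accumulated_phase(xs, ys):
    """Sum the small phase increments along a sampled path in the xy-plane."""
    q = 0.0
    for k in range(1, len(xs)):
        # signed angle between consecutive position vectors, in (-pi, pi]
        cross = xs[k - 1] * ys[k] - ys[k - 1] * xs[k]
        dot = xs[k - 1] * xs[k] + ys[k - 1] * ys[k]
        q += math.atan2(cross, dot)
    return q

def decompose(xs, ys):
    """Return (q, n) with q = atan(r(T)) - atan(r(0)) + n*pi, r = y/x."""
    q = accumulated_phase(xs, ys)
    principal = math.atan(ys[-1] / xs[-1]) - math.atan(ys[0] / xs[0])
    n = round((q - principal) / math.pi)
    return q, n

def arc(start, stop, steps=1000):
    """Unit-circle arc from angle start to angle stop (illustrative path)."""
    angles = [start + (stop - start) * k / steps for k in range(steps + 1)]
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]

# One full counterclockwise loop about the origin: a "click".
q_click, n_click = decompose(*arc(0.1, 0.1 + 2 * math.pi))
# A short excursion that never crosses the y-axis.
q_small, n_small = decompose(*arc(0.1, 0.5))
```

The encircling path gives n = 2, matching the Δn = ±2 identification of a click, while the short excursion gives n = 0.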

The remarks made thus far about the effect of noise on FM reception 
have been general; no assumptions have been made about the statistical 
nature of the additive disturbance. In order to obtain quantitative re-
sults some definite assumptions are necessary. For the remainder of the
paper we shall set ourselves the task of studying the structure of the
probability distribution of q when the input noise statistics are those of a
Gaussian process having a symmetric spectral density about the carrier.
From these distributions we determine the error rates as a function of the
pertinent system parameters.


Fig. 4. Net changes Δn in n(T) caused by some typical paths in the xy-plane.


No attempt will be made in this paper to derive an exact probability 
density for the random variable q. This is not a mathematically tractable
problem since it requires knowledge of the probability distribution of 
zero-crossings of random processes. This by itself has been an area of 
investigation for many years without too much success. The probability 
distribution of the zero-crossings of most elementary random processes 
is not currently known. 

In order to permit an analysis of the model two assumptions are made, 
both of which we feel are quite reasonable. These two assumptions taken 
together state that the three random variables that determine q via 
(12) are all independent. We separate this statement into two assump- 
tions because their individual justification stems from two different 
physical arguments, one having to do with bandwidth and the other 
with signal-to-noise ratio. The first assumption states that tan⁻¹ r(T)
and tan⁻¹ r(0) are independent. For a flat Gaussian noise input this will
be a good approximation if T ≥ 1/W, where W is the input noise band-
width. Since T is also the signaling interval, and the correlation func-
tion of the input noise has its first zero at t ≈ 1/W, the motivation for
this assumption is clear. The second assumption, somewhat harder to
justify, states that n(T) is independent of the previous two random varia-
bles, and the clicks, which comprise n(T), are independent from one
another. This is clearly an assumption expressing an intuitive feeling
that the clicks occur rarely and are of sufficiently short duration. In general,
they will be rare if the signal-to-noise ratio is large, and short if the
bandwidth satisfies W ≥ 1/T as required above.
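
The remark about the first zero of the correlation function can be checked with a small sketch. It assumes (this is not stated explicitly in the text) that the noise band of width W is flat and centered on the carrier, so that the in-phase and quadrature components have the normalized correlation sin(πWτ)/(πWτ). The function name and the value of W are illustrative.

```python
import math

def flat_noise_correlation(tau, W):
    """Normalized correlation of the low-pass noise components for a flat
    band of total width W hertz centered on the carrier (assumed model)."""
    if tau == 0.0:
        return 1.0
    arg = math.pi * W * tau
    return math.sin(arg) / arg

W = 4.0                                          # hypothetical bandwidth, Hz
at_half = flat_noise_correlation(0.5 / W, W)     # still strongly correlated
at_first_zero = flat_noise_correlation(1.0 / W, W)
```

Samples spaced by 1/W are therefore uncorrelated under this model, which is the motivation for treating the two end-point angles as independent when T is at least 1/W.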

These two assumptions plus the identification of crossings of the nega- 
tive x-axis by the moving point in the xy-plane (as calculated by Rice) 
with the occurrence of a click shall constitute our working model of the 
output noise. An indication of how this model must be modified if the 
input noise spectrum is not relatively flat is given in Section VIII. 


IV. THE BASIC DISTRIBUTIONS 


Let y be a Gaussian variable of zero mean, variance σ², and x be another
independent Gaussian variable of mean A, variance σ².* Then the den-
sity p(φ), where tan φ = y/x and φ has the full range of 2π, is well known
and is given by Bennett,¹⁷


$$p(\varphi) = \frac{e^{-\rho}}{2\pi} + \frac{1}{2}\sqrt{\frac{\rho}{\pi}}\,\cos\varphi\,\exp(-\rho\sin^2\varphi)\,\bigl[1 + \operatorname{erf}(\sqrt{\rho}\cos\varphi)\bigr], \tag{13}$$


where ρ = A²/2σ².


* Recall that even though our x(t) and y(t) are not independent processes be-
cause the noise spectrum will not be symmetrical about (ω_c + a_n ω_d), they are inde-
pendent variables.


One fact which is implicitly contained in (13) is the probability P_L
of finding the signal point in the left half of the xy-plane. However, an
easier way to obtain P_L is as follows:


$$P_L = \Pr(x < 0) = \tfrac{1}{2}\operatorname{erfc}\!\left(\frac{A}{\sqrt{2}\,\sigma}\right) \sim \frac{\exp(-\rho)}{2\sqrt{\pi\rho}}. \tag{14}$$


Equation (14) will be of use in the arguments used to discard the "end
effects" at t = 0 and t = T spoken of earlier. Equation (13) also im-
mediately yields the probability density p(ψ) for ψ = tan⁻¹(y/x),
−π/2 ≤ ψ ≤ π/2. Indeed, we have


$$p(\psi) = \bigl[p(\varphi) + p(\varphi + \pi)\bigr]_{\varphi=\psi} = \frac{e^{-\rho}}{\pi} + \sqrt{\frac{\rho}{\pi}}\,\cos\psi\,\exp(-\rho\sin^2\psi)\,\operatorname{erf}(\sqrt{\rho}\cos\psi) \tag{15}$$

for |ψ| ≤ π/2.
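
A numerical sanity check of the angle density may be useful here. The sketch below assumes the standard form of the density of the phase of a sinusoid plus Gaussian noise, p(φ) = e^(−ρ)/(2π) + ½√(ρ/π) cos φ exp(−ρ sin²φ)[1 + erf(√ρ cos φ)], and verifies by trapezoidal integration that it integrates to one and that the probability of the left half-plane equals ½ erfc(√ρ). The value of ρ and the helper names are illustrative.

```python
import math

def p_phi(phi, rho):
    """Assumed density of the full-range angle phi (standard form)."""
    c = math.cos(phi)
    return (math.exp(-rho) / (2 * math.pi)
            + 0.5 * math.sqrt(rho / math.pi) * c
            * math.exp(-rho * math.sin(phi) ** 2)
            * (1 + math.erf(math.sqrt(rho) * c)))

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n panels."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n))
    return s * h

rho = 2.0
total = integrate(lambda t: p_phi(t, rho), -math.pi, math.pi)
# The density is even in phi, so the left half-plane is twice (pi/2, pi).
left = 2 * integrate(lambda t: p_phi(t, rho), math.pi / 2, math.pi)
exact_left = 0.5 * math.erfc(math.sqrt(rho))
```

Agreement between `left` and `exact_left` reflects the fact that the angle lies in the left half-plane exactly when the in-phase component x is negative.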

Suppose ψ₁ and ψ₂ are two independent angles which have the density
(15), and define an angle Ψ = ψ₁ − ψ₂, |Ψ| ≤ π. It will be of interest for
us to determine the probability P_φ that Ψ exceeds some angle φ > 0,
i.e., we would like to determine


$$P_\varphi = \int_{-\pi/2}^{(\pi/2) - \varphi} d\psi_2 \int_{\psi_2 + \varphi}^{\pi/2} d\psi_1\, p(\psi_1)\,p(\psi_2), \qquad \varphi > 0. \tag{16}$$

In general, one is unable to perform these integrations exactly, but since
discussion has already been limited to the large S/N region, little will be
lost if we make use of this in simplifying the evaluation of (16). The
asymptotic evaluation is carried out in detail in the Appendix; we
distinguish three cases:*


Case I; 0 < φ < π/2:

$$P_\varphi \sim \frac{1}{\sqrt{8\pi}}\,\frac{\cot(\varphi/2)}{\sqrt{\cos\varphi}}\,\frac{\exp[-2\rho\sin^2(\varphi/2)]}{\sqrt{\rho}}. \tag{17a}$$

Case II; φ = π/2:

$$P_\varphi \sim \tfrac{1}{4}\exp(-\rho). \tag{17b}$$

Case III; φ > π/2:

$$P_\varphi \sim \frac{\exp[-\rho(1 + \cos^2\varphi)]}{4\pi\sqrt{\pi}\,\rho\sqrt{\rho}\,\sin\varphi\,\cos^2\varphi}. \tag{17c}$$

The most important characteristic of the result (17) is the dependence
of the exponent on angle, since for large ρ the nonexponential factors
are relatively slowly varying.


* In (17) the symbol "~" is used to denote asymptotic equality; this has also
been used in (14). Also (17a) and (17c) do not hold if φ gets too close to the end
points of the appropriate interval. As a rough rule, φ should not be closer than
1/√ρ radians to the end points.


We should remark that for very small angles (15) is well approximated
by the Gaussian curve


$$p(\psi) = \sqrt{\frac{\rho}{\pi}}\,\exp(-\rho\psi^2) \tag{18}$$



of zero mean and variance 1/(2ρ). The difference angle Ψ would, for very
small Ψ, be well approximated by the difference of two independent
Gaussian variables, each having the density (18). The quantity P_φ
calculated on this basis agrees (asymptotically) with the small angle
approximation of (17a).
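
The small-angle agreement claimed above can be checked numerically. The sketch below takes (17a) in the form cot(φ/2) exp[−2ρ sin²(φ/2)] / [√(8π) √(cos φ) √ρ], which is an assumption about how the expression reads, and compares it with the leading asymptotic term of the Gaussian tail for the difference of two independent variables of variance 1/(2ρ). The chosen ρ and φ are illustrative.

```python
import math

def p17a(phi, rho):
    """Assumed form of (17a), intended for 0 < phi < pi/2 and large rho."""
    return (math.cos(phi / 2) / math.sin(phi / 2)
            * math.exp(-2 * rho * math.sin(phi / 2) ** 2)
            / (math.sqrt(8 * math.pi) * math.sqrt(math.cos(phi))
               * math.sqrt(rho)))

def gaussian_tail_asymptote(phi, rho):
    """Leading term of Pr[D > phi], where D is the difference of two
    independent zero-mean Gaussians of variance 1/(2*rho) each."""
    return math.exp(-rho * phi ** 2 / 2) / (phi * math.sqrt(2 * math.pi * rho))

rho, phi = 1.0e4, 0.05   # large S/N, small angle, phi well above 1/sqrt(rho)
ratio = p17a(phi, rho) / gaussian_tail_asymptote(phi, rho)
```

The ratio is within a fraction of a percent of unity for these values, consistent with the statement that the two evaluations agree asymptotically.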

The final item that we discuss in this section is the density of n(T),
or rather we discuss the density of that part of n(T) that arises from the
clicks (Δn = ±2), ignoring Δn = ±1 contributions. For this we need
only take over some ideas and formulas from Rice.⁶ We have that
(ignoring Δn = ±1)


$$\pi n(T) = 2\pi N(T), \tag{19}$$


where N(T) is the number of clicks that occur in time T. Following Rice,
we assume that all clicks are independent and that those tending to in-
crease (decrease) φ by 2π form a Poisson process with rate of occurrence
N₊ (N₋). In general, with a modulated signal, N₊ and N₋ are not equal.
The probability density p(z) of z = N(T) is then given by


$$p(z) = \exp[-(N_+ + N_-)T] \sum_{k=-\infty}^{\infty} \left(\frac{N_+}{N_-}\right)^{k/2} \delta(z - k)\, I_k\!\left(2T\sqrt{N_+ N_-}\right), \tag{20}$$


as may be shown by forming the discrete convolution of the densities
of the positive and negative clicks. In (20) δ(·) is the Dirac delta func-
tion and I_k(u) is the modified Bessel function of integer order k, be-
having for small u as¹⁸


$$I_k(u) \sim \frac{1}{k!}\left(\frac{u}{2}\right)^k; \tag{21}$$


also

$$I_{-k}(z) = I_k(z).$$
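
The statement that (20) follows from a discrete convolution can be verified directly. The sketch below assumes the weights in (20) have the form exp[−(N₊+N₋)T] (N₊/N₋)^(k/2) I_k(2T√(N₊N₋)), which is the known distribution of the difference of two independent Poisson counts, and compares them with an explicit sum over positive and negative click counts. The rates and the helper names are hypothetical.

```python
import math

def bessel_i(k, u, terms=60):
    """Modified Bessel function I_k(u), integer k, from its power series."""
    k = abs(k)  # I_{-k} = I_k for integer order
    return sum((u / 2) ** (2 * m + k)
               / (math.factorial(m) * math.factorial(m + k))
               for m in range(terms))

def net_click_pmf(k, n_plus, n_minus, T):
    """Pr[net click count = k] in the Bessel-function form."""
    a, b = n_plus * T, n_minus * T
    return (math.exp(-(a + b)) * (a / b) ** (k / 2)
            * bessel_i(k, 2 * math.sqrt(a * b)))

def poisson(j, mean):
    return math.exp(-mean) * mean ** j / math.factorial(j)

def by_convolution(k, n_plus, n_minus, T, jmax=80):
    """Same probability from (j + k) positive and j negative clicks."""
    a, b = n_plus * T, n_minus * T
    return sum(poisson(j + k, a) * poisson(j, b)
               for j in range(max(0, -k), jmax))

n_plus, n_minus, T = 0.3, 1.2, 1.0   # hypothetical rates and interval
total = sum(net_click_pmf(k, n_plus, n_minus, T) for k in range(-40, 41))
```

Here k counts positive clicks minus negative clicks; with N₋ larger than N₊ the distribution is skewed toward negative k.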


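The type of modulation that we are concerned with is when the in-
stantaneous frequency deviates by ω_d from the carrier* for a time T,
T being the signaling and processing interval. For this situation Rice
gives for the average rates N₊ and N₋ when the noise at the receiver
input is Gaussian


$$N_+ = \tfrac{1}{2}\Bigl\{\sqrt{r^2 + f_d^2}\,\bigl[1 - \operatorname{erf}\sqrt{\rho + \rho f_d^2/r^2}\,\bigr] - f_d \exp(-\rho)\bigl[1 - \operatorname{erf}(f_d\sqrt{\rho}/r)\bigr]\Bigr\} \tag{22}$$

and

$$N_- = N_+ + f_d \exp(-\rho), \tag{23}$$

where†

$$r = \frac{1}{2\pi}\,\frac{\dot{\sigma}}{\sigma}, \qquad \dot{\sigma}^2 = \operatorname{var}\dot{x} = \operatorname{var}\dot{y}, \qquad \sigma^2 = \operatorname{var} x = \operatorname{var} y. \tag{24}$$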
Under the assumption that f_d is positive we have asymptotically

$$N_+ \sim \frac{r^5}{4\rho\sqrt{\pi\rho}\,f_d^2\,(r^2 + f_d^2)}\exp[-\rho(1 + f_d^2/r^2)], \qquad N_- \sim N_+ + f_d \exp(-\rho). \tag{25}$$


Thus, we see that for large ρ an ever greater majority of clicks occur in
the negative direction (f_d > 0) and for our purposes of computing error
rate the clicks in the positive direction may be neglected; i.e., we shall use


$$N_+ \sim 0, \qquad N_- \sim f_d \exp(-\rho), \qquad \text{for } f_d > 0. \tag{26}$$

For f_d < 0 the situation is reversed of course. We note that the effect of
the clicks on a modulated carrier is to tend to make the measured fre-
quencies appear closer to the carrier frequency than the transmitted
frequencies. That is, confining oneself for the moment to only errors
caused by clicks, frequencies transmitted higher (lower) than the carrier
will be measured to be at that frequency or a lower (higher) one, when
the noise is small.


* We trust that no confusion will arise between r introduced in (24) and r(t)
introduced in (9).

† The case ω_d = 0 corresponds to no modulation. Also, for ease of writing, we no
longer explicitly consider the factor a_n.
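
To see how quickly the positive clicks become negligible, Rice's rates can be evaluated directly. The sketch below assumes the form N₊ = ½{√(r² + f_d²) erfc√(ρ + ρf_d²/r²) − f_d e^(−ρ) erfc(f_d√ρ/r)} together with N₋ = N₊ + f_d e^(−ρ); the numerical values of f_d and r are arbitrary illustrations, and the function name is hypothetical.

```python
import math

def click_rates(f_d, rho, r):
    """Average click rates (N_plus, N_minus), using the assumed form of
    Rice's expressions with erfc(z) = 1 - erf(z)."""
    n_plus = 0.5 * (math.sqrt(r * r + f_d * f_d)
                    * math.erfc(math.sqrt(rho + rho * f_d ** 2 / r ** 2))
                    - f_d * math.exp(-rho)
                    * math.erfc(f_d * math.sqrt(rho) / r))
    n_minus = n_plus + f_d * math.exp(-rho)
    return n_plus, n_minus

for rho in (2.0, 5.0, 10.0):
    n_plus, n_minus = click_rates(f_d=1.0, rho=rho, r=1.0)
    print(rho, n_plus, n_minus, n_plus / n_minus)

n_plus_10, n_minus_10 = click_rates(f_d=1.0, rho=10.0, r=1.0)
```

Using `math.erfc` rather than `1 - math.erf` avoids loss of precision when the argument is large, which matters because N₊ is the small difference of two nearly equal terms.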

Since we shall use approximation (26), the distribution (20) for
z = N(T) may be replaced by the simpler Poisson one, where the proba-
bility of getting exactly N (negative) clicks in time T is given by*


$$p[N(T)] = \frac{\exp(-N_- T)\,(N_- T)^{N(T)}}{[N(T)]!}. \tag{27}$$

Also the probability of getting M or more clicks is, for large signal-to-
noise ratios, approximately the probability of getting exactly M clicks.
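
The closing remark, that "M or more" is nearly the same as "exactly M" at large signal-to-noise ratio, is easy to confirm for a Poisson count with a small mean. The values of f_d T and ρ below are hypothetical, and the mean is taken as N₋T = f_d T exp(−ρ) following (26).

```python
import math

def poisson_pmf(m, mean):
    """Probability of exactly m events for a Poisson count."""
    return math.exp(-mean) * mean ** m / math.factorial(m)

def poisson_tail(m, mean, extra=60):
    """Probability of at least m events, summed term by term so that very
    small values are not lost to rounding."""
    return sum(poisson_pmf(j, mean) for j in range(m, m + extra))

f_d_T, rho = 0.5, 8.0            # hypothetical values
mean = f_d_T * math.exp(-rho)    # expected number of clicks in time T
ratios = [poisson_tail(m, mean) / poisson_pmf(m, mean) for m in (1, 2, 3)]
```

Each ratio exceeds one only by a term of order the mean itself, so the tail is dominated by its first term.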


V. DISTRIBUTION OF OUTPUT AND PROBABILITY OF ERROR 


Equations (14), (17), (26), and (27) provide the information required
to calculate the distribution of q, (12). In principle we simply convolve
the continuous density of [tan⁻¹ r(T) − tan⁻¹ r(0)] with the discrete
density of n(T)π. In Fig. 5, we have given a qualitative sketch of the
result, neglecting end effects. This picture is intended to show that the
density consists of a central lobe about the transmitted frequency ex-
tending to ±π on each side, which is the density of [tan⁻¹ r(T) −
tan⁻¹ r(0)], plus identically shaped lobes displaced by integral multiples
of 2π toward lower frequencies (assuming f_d > 0). These displaced lobes
are weighted by the probability of getting the appropriate number of
clicks to effect the displacement. Thus, the lobe occupying the space
−2nπ ± π is weighted by the probability of getting exactly n clicks in
time T. For n = 0 the weighting is essentially one, for large S/N. There
are, strictly speaking, similar lobes and weightings on the opposite side
as well, but these weights are, for large S/N, negligible compared to the
corresponding lobe we have drawn. That is to say, the first lobe on the
right (not shown in Fig. 5) has small probability compared to the first
lobe on the left, but has a large probability compared to the second lobe
on the left. Nevertheless, we have neglected to include it because we will
generally be concerned with probabilities like Pr[|q − f_d T| > φ], and
thus corresponding weights are important. We dwell on this point be-
cause it is conceivable that for some practical or conceptual application
the neglect would not be justified.


* We confine ourselves to f_d > 0. Exactly analogous considerations apply to
f_d < 0. The case f_d = 0 occurs if an odd number of frequencies are allowed.


Fig. 5. Qualitative sketch of density of q' (neglecting end effects) for f_d > 0.
The dashed lines are for reference in the text.

The discussion given above is still not quite correct; it is modified
when we include end effects. The principal correction that inclusion of
end effects will cause is to add two more side lobes, one over the interval
[−2π, 0] and the other over the interval [0, 2π]. The weightings of these
lobes certainly should not exceed the estimate given in (14), and this
will be enough to exclude them for our purposes.

We now apply our results to some typical calculations. Consider the
case of narrow band* FM (defined by Δf_d T < 1/2), where one has J
equally spaced frequencies of separation Δf_d crowded into a bandwidth
W. The probability of error for any one of the frequencies† (not situated
at the ends) is the area outside of the interval bounded by lines L₂ and
L₃ in Fig. 5. If L₂ and L₃ are defined by |q| = φ then the probability
of error for such a frequency is, from (17a),


$$P_e \sim \frac{1}{\sqrt{2\pi}}\,\frac{\cot(\varphi/2)}{\sqrt{\cos\varphi}}\,\frac{\exp[-2\rho\sin^2(\varphi/2)]}{\sqrt{\rho}}, \tag{28}$$

where, if one assumes that the bandwidth W = JΔf_d, one would take

$$\varphi = \frac{\pi W T}{J}. \tag{29}$$

Our requirement that Δf_d T < 1/2 implies J > 2 for the narrow-band
formula to be applicable (assuming WT = 1). Note sin²(φ/2) is less
than 1/2, and thus the exponent in (28) is exp[−kρ], where k < 1. Now
the contribution of the clicks to P_e is essentially the area A₁ of the
first side lobe which is by (26) and (27)


$$A_1 = f_d T \exp(-\rho). \tag{30}$$

But expression (30) is, asymptotically, exponentially small compared to
(28). Likewise, the area due to the side lobes caused by end effects is
exponentially small, and the probability of error for narrow-band FM
is given by (28). The result that the clicks do not asymptotically con-
tribute to errors in narrow-band multilevel FM lends justification to a
previous evaluation of this type system by Salz,¹¹ who considered the
special narrow-band system with WT = 1. It is both interesting and
gratifying that this result is in agreement with the result given in Ref.
11. In a later paper, Salz and Koll report on experimental results which
agree with the earlier theoretical results.


* Note the special sense in which the term is used here.
† The P_e for a frequency at the end is one-half the expression (28).
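
The claim that the click contribution is exponentially small next to the continuous-part error can be illustrated numerically. The sketch below assumes that (28) has the form cot(φ/2) exp[−2ρ sin²(φ/2)] / [√(2π) √(cos φ) √ρ], that the threshold angle in (29) is φ = πWT/J, and that the click area is f_d T exp(−ρ); the choice J = 4, WT = 1 and f_d T = 0.5 is hypothetical.

```python
import math

def narrowband_error(rho, J, WT=1.0):
    """Assumed form of (28) with phi = pi*W*T/J; requires phi < pi/2."""
    phi = math.pi * WT / J
    return (math.cos(phi / 2) / math.sin(phi / 2)
            * math.exp(-2 * rho * math.sin(phi / 2) ** 2)
            / (math.sqrt(2 * math.pi) * math.sqrt(math.cos(phi))
               * math.sqrt(rho)))

def click_area(rho, f_d_T):
    """Area of the first side lobe, f_d*T*exp(-rho)."""
    return f_d_T * math.exp(-rho)

# Ratio of click contribution to continuous-part error at several S/N values.
ratios = {rho: click_area(rho, 0.5) / narrowband_error(rho, J=4)
          for rho in (10.0, 20.0, 40.0)}
pe_20 = narrowband_error(20.0, J=4)
```

For J = 4 the exponent in the continuous-part error is about 0.29ρ, against ρ for the clicks, so the ratio falls off roughly as exp(−0.71ρ).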

Next, consider the asymptotic evaluation of P_e for the case of orthog-
onal signals; this case corresponds to (Δω_d)T = π, and we assume that
the thresholds are spaced midway between the frequencies. Thus, (for a
frequency not on the edges) we have that the error probability is given
by the area outside of that bounded between the lines L₁ and L₄. The
contribution from the major lobe is, from (17b),


$$\tfrac{1}{2}\exp(-\rho).$$

In addition, the area of the first side lobe is asymptotically comparable
to this and is

$$f_d T \exp(-\rho),$$


being weakly dependent on the frequency sent. In fact, for the nth signal
(J ≥ 2n) we have for orthogonal signals that

$$f_d T = \frac{n}{2}, \qquad 1 \le n \le \frac{J}{2}.$$


The average error rate is then, for orthogonal signals (J of them, J
even, and equally spaced signals and thresholds),


$$P_e = \tfrac{1}{2}\exp[-\rho] + \tfrac{1}{4}\left(\frac{J}{2} + 1\right)\exp(-\rho). \tag{31}$$


Equation (31) is indeed a surprising result. The first term of (31) is the
probability of confusing the transmitted frequency with one of its
nearest neighbors. The second term is the (average) probability of con-
fusing it with its second nearest neighbor closest to the carrier. This is
because the area from (−π) to (−3π/2) is, by application of (17b),
negligible compared to the area from (−3π/2) to (−5π/2). Thus, it
states that for the multilevel scheme considered here (a not unreason-
able one) one is less likely to confuse a transmitted frequency with its
nearest neighbors than one is to confuse it with a particular one of its
second nearest neighbors. We see from (31) that the error rate from the
continuous part of the output is comparable to the error rate caused by
clicks.

As a final remark about the orthogonal system we see comparing
(31) and (14) why end effects are neglected again.

For a final example, consider the wide-band situation where the signals
are loosely packed in the band; i.e., $(\Delta\omega_d)T > \pi$. Now no
errors will be caused by the continuous part of the output; only clicks
will cause errors. If the frequencies are widely spaced a single click may
not cause an error; several clicks during the time interval $T$ may be
required. Thus, suppose that the frequencies are spaced so that the phase
difference between nearest neighbors is $(\Delta\omega_d)T = 2n\pi$, $n$
being any positive integer. The probability of error will then be the
probability of getting $n$ (or more) clicks in time $T$, which from (26)
and (27) behaves as


$$\frac{(f_d T)^n \exp(-n\rho)}{n!} = \frac{(n/2)^n \exp(-n\rho)}{n!} \geq \tfrac{1}{2}\exp(-n\rho). \tag{32}$$


The coefficient in (32) is at least as bad as for the orthogonal case, but
the important item is the exponent. Superficially at least it appears that
we have gained in performance by spacing the frequencies widely, since
the exponential has changed from $e^{-\rho}$ for the minimum orthogonal
case ($\Delta\omega_d T = \pi$) to $e^{-n\rho}$. One must realize, however,
that one is talking about different $\rho$'s here. The bandwidth for the
case under consideration is essentially $2n$ times the minimum orthogonal
one and therefore, for the same signal power, the nominal value of $\rho$
has decreased by a factor of $2n$, and one has in fact not gained a factor
of $n$ in the exponent: measured in the $\rho$ of the minimum orthogonal
case, the exponent is only $n\rho/2n = \rho/2$. In addition to the
bandwidth penalty, error performance has actually suffered too.
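
The two observations about (32) can be checked numerically. The sketch below tabulates the coefficient $(n/2)^n/n!$ and the exponent that results when $\rho$ is reduced by $2n$ at fixed signal power. The function names and the sample value of $\rho$ are illustrative assumptions, not part of the original analysis.

```python
import math


def click_coefficient(n: int) -> float:
    """Coefficient (n/2)^n / n! that multiplies exp(-n*rho) in (32)."""
    return (n / 2) ** n / math.factorial(n)


def effective_exponent(n: int, rho_orthogonal: float) -> float:
    """Exponent n*rho when rho is 2n times smaller than in the
    minimum orthogonal case (same signal power, 2n times the bandwidth)."""
    return n * (rho_orthogonal / (2 * n))


if __name__ == "__main__":
    rho = 10.0  # assumed sample value of rho for the minimum orthogonal case
    for n in range(1, 9):
        print(n, round(click_coefficient(n), 4), effective_exponent(n, rho))
```

For every $n$ the coefficient stays at or above 1/2, and the effective exponent is $\rho/2$ regardless of $n$, which is the sense in which wide spacing does not help.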


VI. COMPARISON WITH OPTIMUM 


One can demonstrate how the FM discriminator compares with the
optimum detector when used to detect orthogonal signals; i.e., when
$\Delta\omega_d T = \pi$. It is known that when optimum detection is used
for any orthogonal set of signals, the (exponential part of the) error rate
behaves as $\exp[-E/N_0]$, where $E$ is the signal energy (assumed common
to all $J$ levels) and $N_0/2$ is the (two-sided) spectral density of the
noise. If we let $S$ denote the average signal power, write $E = ST$, and
estimate the total bandwidth $W$ for large $J$ by $W = J/(2T)$, we see that
the ideal exponent becomes $\exp[-J\rho/2]$. However, we had seen that,
regardless of the number of levels, the discriminator error rate for
$\Delta\omega_d T = \pi$ behaves as $\exp[-\rho]$. Thus, we have lost a
factor of $J$ in the error exponent by substituting discriminator
detection for matched filter detection.
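
The step from $\exp[-E/N_0]$ to $\exp[-J\rho/2]$ can be written out as follows. This is a sketch that assumes $\rho$ is the ratio of signal power to the noise power $N_0 W$ admitted in the total band $W$; that definition is an assumption made here for illustration.

```latex
% Assumption: rho = S / (N_0 W), the input signal-to-noise ratio in the band W.
\begin{align*}
\frac{E}{N_0} &= \frac{S T}{N_0}, \qquad
\rho = \frac{S}{N_0 W}, \qquad
W = \frac{J}{2T}, \\
\frac{E}{N_0} &= \rho\, W T = \frac{J \rho}{2}.
\end{align*}
```
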

An important conclusion may immediately be drawn concerning the
performance of conventional FM receivers or detectors of orthogonal 
signals. Our results show that the receiver is indeed inferior in perform- 
ance when compared with the optimum. This fact has been stated by 
Wozencraft and Jacobs” and the reasons are clear from our analysis. The 
FM receiver admits too much noise at its front-end which cannot be 
cleaned by the post-detection filter because of the nonlinear anomalies, 
namely the clicks. As a matter of fact, the amount of noise grows in 
direct proportion to the number of orthogonal signals, hence the inferior 
exponent.* The optimum detector is a bank of matched filters. The noise 
power at the output of each filter does not grow with the number of 
signals; it is a fixed constant determined by the bandwidth of the filter, 
which need be roughly no broader than the symbol rate, $1/T$.

This poor performance of conventional FM receivers when used to 
detect data might be remedied by employing an FM with feedback 
system such as described in Refs. 20 and 21. The physical argument to 
support this contention is often stated as follows. In the absence of the 
feedback loop, the IF filter must be wide enough to pass the total swing 
of the incoming signal. However, since the feedback loop tracks the in- 
coming frequency, this IF filter, whose width determines the noise vari- 
ance, could be narrowed and less noise would be admitted. 

This possibility of making use of FM with feedback to improve the 
error rate in digital systems has been suggested by Wozencraft and 
Jacobs.” Unfortunately a mathematical treatment of this difficult 
problem does not exist at present. 


VII. EFFECT OF POST-DETECTION FILTER 


In the previous sections we have discussed in detail the performance 
of an FM discriminator followed by a low-pass filter. The low-pass filter 
was approximated by an ideal integrator whose integration time was 
taken to be equal to the duration of an individual signaling interval. 
Formulas sufficient to evaluate the performance of multilevel FM without
the post-detection filter have recently been developed by Mazo and
Salz (Ref. 10); comparison of the results of the present paper with
Ref. 10 will show the influence of filtering.

* Actually, these qualitative conclusions may be arrived at by the Gaussian
approximation to the output noise. The reason why this works is apparent from
(31), which gives $P_e$ for orthogonal signals. The first term of (31) is not due
to clicks but arises from the continuous part of the output noise. This is the
part that the Gaussian approximation would tend to duplicate. The second term
of (31) is due to clicks and has the same behavior with regard to $\rho$. Even if
one could keep $\rho$ constant as the number of levels $J$ increased, the factor
of $J$ in the click contribution to (31) would still degrade performance.

Suppose that the angular frequency $\psi$ is sent and we ask for the
probability that the observed output is less than $z$, where
$(\psi - z) > 0$. It is shown in Ref. 10 that the probability $P$ is
essentially given by* (for large $\rho$)

$$P = \exp\left[-\rho\,\frac{(\psi - z)^2}{z^2 + \dot\sigma^2/\sigma^2}\right]. \tag{33}$$

Consider the situation for orthogonal signals, or in fact for any signal
set where the frequency spacing between the individual frequencies is
fixed. One expects the ratio $\dot\sigma^2/\sigma^2$ to increase as the
square of the total input bandwidth, hence as $J^2$, the square of the
number of levels. Thus, for a large number of orthogonal levels the
post-detection filter does very well in improving the error performance,
changing the error rate from (roughly) $\exp(-\rho/J^2)$ to $\exp(-\rho)$.
One would certainly expect something like this to be true since, for a
large number of levels, the noise bandwidth before the post-detection
filter is much greater than the signal bandwidth at that point.

Another qualitative effect of the post-detection filter may be noted.
From (33) we see that the distribution of output noise without the
post-detection filter depends on the frequency sent, because of the factor
$(z^2 + \dot\sigma^2/\sigma^2)$ in the exponent; the “spread” of the
probability density will be roughly twice as great at the ends of the band
as at the center, and thus without a post-detection filter one would not
choose the frequencies to be equally spaced. We have seen that there is no
such dependence of the error rate exponent on the transmitted frequency
when the post-detection filter is present.


VIII. AN APPARENT PARADOX 


At this point we have basically concluded our discussion of error rates 
in digital FM, based in part upon the theory of “clicks” in FM receivers. 
In particular, we have seen in Section VI that even when frequencies
were widely spaced so that $\omega_d T$ is many multiples of $2\pi$ the error
performance did not improve. The reason was noted to be that although the
distance between frequencies increased, the noise admitted to the system 
increased by a corresponding factor. The latter is predicated on the 
assumption that the input bandpass filter is essentially a flat filter up 
to some cutoff frequency determined by the signal spectrum. It may be 
possible, however, to shape the front-end filter so that increasing the
frequency separation does not cause a proportionate increase in the
noise power admitted. We know that the power spectrum of the transmitted
signal will have peaks at the transmitted frequencies of width of
the order $(1/T)$. Suppose we have a notch filter then, with transmittance
peaks at the possible frequencies of the appropriate width. The input
noise power will be constant and therefore by choosing a large enough
separation one can force the probability of error to be arbitrarily small,
contradicting optimality considerations for reception of signals against
a white Gaussian noise background.

* Equation (33) represents only the exponential part of $P$. Also, (33) is true
[see Ref. 10] only if $(\psi - z)^2/[z^2 + \dot\sigma^2/\sigma^2] < 1$.

† Set $(\psi - z)^2 \approx (\Delta f)^2$, $(\dot\sigma^2/\sigma^2) \approx J^2(\Delta f)^2$.

Before giving what we feel is the correct answer to the stated paradox, 
we wish to explore some other considerations which, on the surface, 
might resolve the paradox without changing the basic assumptions of the 
model. One might first object that our argument was too heuristic; is 
the noise power really constant as the frequency separation increases? 
To answer this we have performed the following calculations. We have 
chosen transmitting and receiving filters so that the FM signal is strictly 
undistorted and then optimized the filters to minimize the variance of 
the noise admitted. This procedure is discussed in Ref. 11, and the results 
depend on the power spectrum of the noise. We then specialize to a 
binary system and, using (48) of Ref. 13 for the spectral density of a 
binary FSK wave train, calculate the noise admitted. The result shows 
that while the noise admitted does, in fact, increase as the frequency 
separation increases, it does so only logarithmically with the separation. 
Thus, the error probability still will decrease to an arbitrarily small 
value as the separation increases and from this point of view the question 
is still unresolved. 

A second consideration is the following. The probability of error that
we have calculated was based on asymptotic approximations to formulas
given in Ref. 6. The results depended only on the amplitude of the
received FM wave and the average noise power $\sigma^2$ at the input to the
limiter-discriminator; if one allows transmitting and receiving filters
the more relevant parameters are the average signal power on the line,
$P_{\text{line}}$, and $\sigma^2$. However, the exact formulas of Rice also
involve the quantity $\dot\sigma^2$, which is the average power in the
derivative of the noise at the input to the limiter-discriminator (after
the receiving filter).* Let $S(\omega)$ be the signal spectral density and
$F(\omega)$ the transmittance of the receiving filter. Further, let us
insist that the signal at the input to the limiter-discriminator be
exactly the FSK wave described,† so the transmitting filter is the inverse
of the receiving filter. We then have for a white noise background of
$N_0$‡

* Rice, Ref. 6, uses the parameter $r = (1/2\pi)(\dot\sigma/\sigma)$.

† We emphasize that continuous phase at frequency transition is demanded, but
nothing more.

‡ It is for such a noise background that the optimum results are known.
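
$$\sigma^2 = \frac{N_0}{2\pi}\int |F(\omega)|^2\,d\omega \tag{34a}$$

$$\dot\sigma^2 = \frac{N_0}{2\pi}\int (\omega - \omega_c)^2\,|F(\omega)|^2\,d\omega \tag{34b}$$

$$P_{\text{line}} = \frac{1}{2\pi}\int \frac{S(\omega)}{|F(\omega)|^2}\,d\omega. \tag{34c}$$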


When one realizes that the spectrum of an FSK wave decreases at
infinity like the inverse fourth power of the frequency, (34b) and (34c)
imply that $\dot\sigma^2$ and the line power $P_{\text{line}}$ cannot both
be finite. Thus, suppose $\dot\sigma^2$ is finite. The convergence of the
integral in (34b) implies that $|F(\omega)|^2$ must decrease at least like
$1/\omega^{3+\epsilon}$, $\epsilon > 0$. The integral for $P_{\text{line}}$
will, for large $\omega$, look like

$$\int^{\infty} \omega^{-4}\,\omega^{3+\epsilon}\,d\omega,$$


which diverges. Likewise, the assumption of finite line power implies
$\dot\sigma^2$ is infinite. An infinite $\dot\sigma^2$ certainly violates
the conditions under which the asymptotic results of Rice’s formulas hold.
In particular, these formulae show that an infinite $\dot\sigma^2$
corresponds to an infinite average number of clicks per second (assuming
such a language is still possible) and the FM discriminator will not work,
in the strict sense. On the other hand, if we choose the evil of infinite
line power then perfect performance is not surprising.
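
The divergence argument can be laid out in three lines. In this sketch the constants $c$, $C$, and $\Omega_0$ are hypothetical positive numbers introduced only to make the bounds explicit; they do not appear in the original.

```latex
% Sketch: c, C and \Omega_0 are hypothetical positive constants (assumptions).
\begin{align*}
\dot\sigma^2 < \infty
  &\;\Rightarrow\; |F(\omega)|^2 \le C\,\omega^{-(3+\epsilon)},
     \quad \omega > \Omega_0,\ \epsilon > 0, \\
S(\omega) \approx c\,\omega^{-4}
  &\;\Rightarrow\; \frac{S(\omega)}{|F(\omega)|^2} \ge \frac{c}{C}\,\omega^{-1+\epsilon}, \\
\int_{\Omega_0}^{\Omega} \omega^{-1+\epsilon}\,d\omega
  &= \frac{\Omega^{\epsilon} - \Omega_0^{\epsilon}}{\epsilon}
     \;\longrightarrow\; \infty \quad (\Omega \to \infty).
\end{align*}
```
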

While the above theorem about $\sigma$, $\dot\sigma$, $P_{\text{line}}$ is
true from a mathematical point of view, it is almost irrelevant from an
engineering point of view because it involves discussions of infinitely
large frequencies, and does not really eliminate the paradox at all. We
need merely precede the limiter-discriminator with a flat filter with a
cutoff so high that the signal is almost undistorted. Since real
discriminators work, this is not an unreasonable thing to assume. Now
$\dot\sigma$ is finite, and although we may have to go to extremely large
S/N ratios, the paradox is as entrenched as ever.

The resolution of the problem lies in a reinterpretation of Rice’s
calculation of the average number of crossings of the negative x-axis.
We had assumed each crossing corresponds to an encirclement of the
origin which is independent of all past and future encirclements. This is
reasonable when the receiving filter is essentially flat across the whole
received spectrum and the correlation time out of the receiving filter is
small ($\sim 1/W$). However, if the input noise spectrum is chopped into a
few slits or notches, correlations in the noise being processed in the
detector will persist for a longer time and multiple encirclements of the
origin can occur with essentially the same probability that one would
normally associate with a single large excursion close to the origin.

To make our arguments more precise we consider a binary situation
at almost zero rate, i.e., we have very narrow filters $F_1$ and $F_2$ about
the frequencies $(\omega_c + \omega_d)$ and $(\omega_c - \omega_d)$,
respectively. The bandwidth of these individual filters is of order $1/T$.
The noise out of $F_1$ and $F_2$ can be written as

$$\begin{aligned}
n_1(t) &= n_{1x}(t)\cos(\omega_c + \omega_d)t - n_{1y}(t)\sin(\omega_c + \omega_d)t \\
n_2(t) &= n_{2x}(t)\cos(\omega_c - \omega_d)t - n_{2y}(t)\sin(\omega_c - \omega_d)t,
\end{aligned} \tag{35}$$


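where $n_{1x}(t)$, etc., are independent baseband noise currents. If we
assume that the frequency $(\omega_c + \omega_d)$ is being transmitted with
amplitude $A$, then in a “coordinate system” following that frequency we
have

$$\begin{aligned}
x &= A + X \\
y &= Y,
\end{aligned} \tag{36}$$

where

$$\begin{aligned}
X &= n_{1x} + n_{2x}\cos 2\omega_d t + n_{2y}\sin 2\omega_d t \\
Y &= n_{1y} + n_{2y}\cos 2\omega_d t - n_{2x}\sin 2\omega_d t.
\end{aligned} \tag{37}$$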


A typical portion of the path that the noise traces out in the $xy$ plane
can be calculated from (36) and (37) and is shown in Fig. 6.

Fig. 6 - Small portions of some noise trajectories when the receiving
filter has two transmittance peaks. Each trajectory is a circle of radius
$R = \sqrt{(n_{2x})^2 + (n_{2y})^2}$ traversed with angular velocity $2\omega_d$.

Neglecting the time variations of $n_{1x}(t)$, etc., which vary on a time
scale comparable to $T$, we see the path is a circle centered at
$(A - n_{1x}, -n_{1y})$, of radius $\sqrt{n_{2x}^2 + n_{2y}^2}$, and
counter-clockwise angular velocity of $(2\omega_d)$. If $\sigma_1^2$ and
$\sigma_2^2$ denote the average noise powers out of $F_1$ and $F_2$,
respectively, then the probability $P$ that the circle is appropriately
situated with a large enough radius to encircle the origin is given
exactly by


$$P = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\exp(-\rho), \tag{38}$$

where $\rho = A^2/2(\sigma_1^2 + \sigma_2^2)$. For the case of a symmetrical
spectrum about the carrier ($\sigma_1^2 = \sigma_2^2$), (38) is comparable
to (26). However, our circle is rotating with frequency $2f_d$ and will
have a constant radius for about $T$ seconds; thus, it will complete
$2f_d T$ revolutions in time $T$. As the frequencies are spread and notched
filters are used the noise indeed does not increase proportionally, but
the number of multiple encirclements of the origin that a click will make
does increase as the separation. Thus, the filter shaping under discussion
will affect the statistical structure of the clicks, preventing a
violation of optimality.
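
The encirclement probability (38) lends itself to a simulation check. The sketch below assumes the circle's center is the signal point $(A, 0)$ displaced by a two-dimensional Gaussian with variance $\sigma_1^2$ per component, and that the radius is the length of an independent Gaussian vector with variance $\sigma_2^2$ per component. The function names, sample parameters, and trial count are illustrative assumptions.

```python
import math
import random


def encirclement_probability(amplitude, sigma1, sigma2):
    """Closed form (38): sigma2^2 / (sigma1^2 + sigma2^2) * exp(-rho)."""
    total = sigma1 ** 2 + sigma2 ** 2
    rho = amplitude ** 2 / (2 * total)
    return sigma2 ** 2 / total * math.exp(-rho)


def simulate(amplitude, sigma1, sigma2, trials=100_000, seed=1):
    """Monte Carlo estimate: the circle encloses the origin when its
    radius exceeds the distance from its center to the origin."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cx = amplitude + rng.gauss(0.0, sigma1)  # center, x component
        cy = rng.gauss(0.0, sigma1)              # center, y component
        rx = rng.gauss(0.0, sigma2)              # rotating noise vector
        ry = rng.gauss(0.0, sigma2)
        if rx * rx + ry * ry > cx * cx + cy * cy:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(simulate(1.5, 1.0, 1.0), encirclement_probability(1.5, 1.0, 1.0))
```

With equal noise powers the closed form reduces to $\tfrac{1}{2}\exp(-\rho)$, the symmetrical case mentioned in the text.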

Note added in Proof. A discussion of the click contribution to the error 
rate has been given very recently by J. Klapper in the RCA Review, 
June, 1966. 


APPENDIX 


Asymptotic Behavior of $P_e$


We wish to record here an outline* of the details of the evaluation of
(16) for large S/N so as to obtain the results given in (17). If we set

$$p(x) = \frac{1}{\pi}\exp(-\rho) + \sqrt{\frac{\rho}{\pi}}\,\cos x\,\exp(-\rho\sin^2 x)\cdot\operatorname{erf}(\sqrt{\rho}\cos x), \tag{39}$$

then according to (16) the required probability is written as

$$P_e = \int_{-\pi/2}^{(\pi/2)-\varphi} dy \int_{y+\varphi}^{\pi/2} dx\; p(y)\,p(x). \tag{40}$$

If we define the distribution function

$$P(\xi) = \int_{-\pi/2}^{\xi} p(y)\,dy \tag{41}$$


* We do not explain the techniques used here for the asymptotic evaluation of
integrals. The interested reader may wish to consult the topics “saddle point
method,” “Laplace’s method,” and “Watson’s lemma” in Ref. 2

and perform an integration by parts, (40) becomes

$$P_e = \int_{-\pi/2}^{(\pi/2)-\varphi} P(y)\,p(y+\varphi)\,dy. \tag{42}$$

Our evaluation will be based upon approximating the functions $P(y)$
and $p(y)$ when $\rho$ is large. In particular, from (39) we see

$$p(y) \sim \sqrt{\frac{\rho}{\pi}}\,\cos y\,\exp(-\rho\sin^2 y), \tag{43}$$

provided $y$ is not close to $\pm\pi/2$. Integrating (43) yields

$$P(y) \sim \tfrac{1}{2}\left[\operatorname{erf}\sqrt{\rho} + \operatorname{erf}(\sqrt{\rho}\sin y)\right], \tag{44}$$

which will be a good approximation for large $\rho$ except when $y$ is near
$-\pi/2$. These exceptional points will receive special consideration.
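
A numerical sketch can show how well (44) tracks the exact distribution function. It integrates the density (39) with a composite Simpson rule and compares the result with the closed-form approximation. The function names, the grid size, and the sample value of $\rho$ are illustrative assumptions.

```python
import math


def phase_density(x, rho):
    """Density (39) on the interval (-pi/2, pi/2)."""
    s, c = math.sin(x), math.cos(x)
    return (math.exp(-rho) / math.pi
            + math.sqrt(rho / math.pi) * c * math.exp(-rho * s * s)
            * math.erf(math.sqrt(rho) * c))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def distribution_exact(y, rho):
    """P(y) of (41), obtained by numerical integration of (39)."""
    return simpson(lambda t: phase_density(t, rho), -math.pi / 2, y)


def distribution_approx(y, rho):
    """Large-rho approximation (44)."""
    r = math.sqrt(rho)
    return 0.5 * (math.erf(r) + math.erf(r * math.sin(y)))


if __name__ == "__main__":
    rho = 10.0  # assumed sample value
    for y in (-1.0, -0.3, 0.0, 0.3, 1.0):
        print(y, distribution_exact(y, rho), distribution_approx(y, rho))
```

The exact distribution reaches unity at $y = \pi/2$, which also confirms that (39) is a properly normalized density on $(-\pi/2, \pi/2)$.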

Fig. 7 - Symbolic representation of the factors in the integrand of (42),
drawn for $\varphi < \pi/2$.

As a first example consider the case when $\varphi < \pi/2$. The integrand
for (42) is shown symbolically in Fig. 7. Consider the contribution first
from negative $y$. This is, from (42), (43), and (44),

$$\frac{1}{2}\int_{-\pi/2}^{0} dy\,\left[\operatorname{erf}\sqrt{\rho} - \operatorname{erf}(\sqrt{\rho}\sin|y|)\right]\sqrt{\frac{\rho}{\pi}}\,\cos(y+\varphi)\,\exp[-\rho\sin^2(y+\varphi)]. \tag{45}$$

Next, approximate $\operatorname{erf}\sqrt{\rho}$ by unity to obtain

$$\frac{1}{2}\int_{-\pi/2}^{0} dy\,\operatorname{erfc}(\sqrt{\rho}\sin|y|)\,\sqrt{\frac{\rho}{\pi}}\,\cos(y+\varphi)\,\exp[-\rho\sin^2(y+\varphi)] \tag{46}$$

and use the asymptotic expansion

$$\operatorname{erfc} x \sim \frac{1}{\sqrt{\pi}\,x}\exp(-x^2). \tag{47}$$
The resultant integrand has a saddle point at $y = -\varphi/2$, and a
routine saddle point evaluation will yield (17a) of the text. It is easy to
verify that replacing $\operatorname{erf}\sqrt{\rho}$ by unity in (45)
creates an asymptotically small error. Likewise the neglect of positive $y$
is asymptotically small, for

$$\begin{aligned}
\int_{0}^{(\pi/2)-\varphi} dy\,P(y)\,p(y+\varphi)
 &\le \int_{0}^{(\pi/2)-\varphi} dy\,p(y+\varphi) \\
 &\le \left[\frac{\pi}{2} - \varphi\right]\frac{1}{\pi}\exp(-\rho)
   + \int_{0}^{(\pi/2)-\varphi} dy\,\sqrt{\frac{\rho}{\pi}}\,\cos(y+\varphi)\,\exp[-\rho\sin^2(y+\varphi)] \\
 &\sim \frac{\exp[-\rho\sin^2\varphi]}{2\sqrt{\pi\rho}\,\sin\varphi}
\end{aligned} \tag{48}$$

by Laplace’s method. For $\varphi < \pi/2$ we have

$$2\sin^2(\varphi/2) < \sin^2\varphi,$$

which proves our point. The addition of the term $(1/\pi)\exp(-\rho)$ in
(48) provides a strict upper bound to $p(y+\varphi)$ and thus takes care of
special considerations at the right end of $p(y+\varphi)$. At the left end
point of the range of integration, $(-\pi/2)$, $p(y+\varphi)$ is still well
approximated. The function $P(y)$ is, however, approximately

$$P(y) \approx \frac{\exp(-\rho)}{\pi}\left[\frac{\pi}{2} + y\right], \qquad y \approx -\frac{\pi}{2}. \tag{49}$$

Using (49) it is easy to obtain an estimate of the contribution of the
left end point behaving as $\exp(-\rho)$ and this is asymptotically small.
This ends our discussion for $\varphi < \pi/2$.

We give a somewhat more condensed outline for $\varphi = \pi/2$. The
contribution of the middle of the range of integration is again
approximated by (46) with $\varphi = \pi/2$. Using (47), (46) immediately
evaluates to

$$[\exp(-\rho)]/4.$$

Next, consider the error made at the right end point. Equation (43)
holds to within a strip of order $1/\sqrt{\rho}$ from 0, after which
$p(y+\varphi)$ behaves like $[\exp(-\rho)]/\pi$. Therefore, the error
behaves like

$$\frac{\exp(-\rho)}{2}\,\frac{1}{\pi}\,\frac{1}{\sqrt{\rho}},$$

which is asymptotically small.

The left end point error is bounded by

$$\int_{-\pi/2}^{0} \frac{\exp(-\rho)}{\pi}\left(\frac{\pi}{2} + y\right)\sqrt{\frac{\rho}{\pi}}\,\sin|y|\,\exp(-\rho\cos^2 y)\,dy \sim \frac{\exp(-\rho)}{2\pi\sqrt{\pi\rho}},$$

which is again asymptotically small.

Our final case is $\varphi > \pi/2$, and this time end point contributions
will not be small. The reason is that if one examines the integral
representing the contribution from the middle of the range of integration,
i.e.,

$$\frac{1}{2}\int_{-\pi/2}^{(\pi/2)-\varphi} dy\,\left[\operatorname{erfc}(\sqrt{\rho}\sin|y|) - \operatorname{erfc}\sqrt{\rho}\right]\sqrt{\frac{\rho}{\pi}}\,\cos(y+\varphi)\,\exp[-\rho\sin^2(y+\varphi)], \tag{50}$$

it is exponentially dominated by contributions near the end points. But
in (50) our approximation to $P(y)$ vanishes faster than the correct $P(y)$
at $y = -\pi/2$, and our approximation to $p(y+\varphi)$ vanishes at
$y = (\pi/2 - \varphi)$ while the true $p(y+\varphi)$ does not. This implies
that the asymptotic evaluation of (50) will be asymptotically smaller than
the correct contributions from the ends of the interval. The contribution
from the right end is

$$\int_{-\pi/2}^{(\pi/2)-\varphi} dy\,\tfrac{1}{2}\left[\operatorname{erf}\sqrt{\rho} - \operatorname{erf}(\sqrt{\rho}\sin|y|)\right]\frac{\exp(-\rho)}{\pi} \sim \frac{\exp[-\rho(1+\cos^2\varphi)]}{4\pi\sqrt{\pi}\,\rho\sqrt{\rho}\,\sin\varphi\,\cos^2\varphi}. \tag{51}$$

The lower limit of integration in (51) is immaterial, as will be the upper
limit in (52). For the contribution from the left end we have

$$\frac{e^{-\rho}}{\pi}\int_{-\pi/2}^{(\pi/2)-\varphi} dy\,\left[\frac{\pi}{2} + y\right]\sqrt{\frac{\rho}{\pi}}\,\cos(y+\varphi)\,\exp[-\rho\sin^2(y+\varphi)] \sim \frac{\exp[-\rho(1+\cos^2\varphi)]}{4\pi\sqrt{\pi}\,\rho\sqrt{\rho}\,\sin\varphi\,\cos^2\varphi}. \tag{52}$$

The sum of (52) and (51) yields (17c) of the text.

Noise in an FM System Due to an
Imperfect Linear Transducer


By M. L. LIOU 
(Manuscript received February 28, 1966) 


An approach to the calculation of intermodulation noise in FM systems
due to imperfect transmission media is presented in this paper. The
technique is essentially that originating with Carson and Fry. In this
paper we extend a formulation due to Rice to include an arbitrary
continuous pre-emphasis characteristic as well as an arbitrary gain and
phase shape transmission medium which are representable by low-order
polynomial series in radian frequency. Series approximations are carried
out far enough to ensure accurate results for transmission characteristics
normally encountered in broadband microwave radio systems. In many cases,
only the second- and third-order noise is significant in broadband
microwave radio systems. Hence, the analysis carried out in this paper
considers only the second- and third-order distortion terms. A digital
computer program concerning the intermodulation noise has been written.
This analysis and the digital computer program are of aid in the design of
microwave radio systems. With a slight modification, the calculation of
noise due to AM-to-PM conversion caused by transmission deviation can also
be accomplished. An optimum design of the pre-emphasis network may be
achieved by using the computer programs through an iterative approach.


I. INTRODUCTION 


In the course of designing FM systems, intermodulation noise is an 
important factor which deserves special attention. Many people have 
made contributions to this subject.11° In this paper, we extend their 
results to include an arbitrary continuous pre-emphasis characteristic 
as well as an arbitrary gain and phase shape transmission medium which 
are representable by low-order polynomial series in radian frequency. 
Series approximations are carried out far enough to ensure accurate 
results for transmission characteristics normally encountered in broad- 
band microwave radio systems. The multichannel baseband signal of an 
FM system is represented by a band of Gaussian random noise with flat
power-density spectrum. The noise due to imperfect transmission me- 
dium can be calculated at any frequency in the baseband. In many 
cases, only the second- and third-order noise is significant in broadband 
microwave radio systems. Hence, the analysis carried out in this paper 
considers only the second- and third-order distortion terms. Extension 
to a higher-order distortion becomes unmanageable. A digital computer 
program concerning the intermodulation noise has been written. A 
typical problem can be solved at a very low cost. This analysis and the 
digital computer program are of aid in the design of microwave radio 
systems. With a slight modification, the calculation of noise due to 
AM-to-PM conversion caused by transmission deviation can also be 
accomplished. An optimum design of the pre-emphasis network may be 
achieved by using the computer programs through an iterative approach. 
Several examples are given for illustration. 


II. DESCRIPTION OF SYSTEM 


A portion of an FM system can be represented by the block diagram
shown in Fig. 1. An FM baseband signal, $\varphi_i'(t) = d\varphi_i(t)/dt$,
is fed into a pre-emphasis network. The output of the pre-emphasis network
is the pre-emphasized signal $\varphi'(t)$. This signal passes through an
FM modulator to yield the FM wave, $\cos[\omega_c t + \varphi(t)]$, where
$\omega_c$ is the angular carrier frequency. When this FM wave goes through
an imperfect transmission medium, the output is distorted and becomes
$V(t)\cos[\omega_c t + \varphi_0(t)]$. Let the transfer function of the
transmission medium be

$$Y(\omega) = \exp[-\alpha(f) - i\beta(f)], \tag{1}$$

where $\omega = 2\pi f$. Its impulse response is

$$g(t) = \int_{-\infty}^{\infty} Y(\omega)\exp(i\omega t)\,df. \tag{2}$$

Fig. 1 - Block diagram of a portion of an FM system. Blocks: pre-emphasis
network, FM modulator, transmission medium. Signals: baseband signal
$\varphi_i'(t)$, pre-emphasized baseband signal $\varphi'(t)$, FM wave
$\cos[\omega_c t + \varphi(t)]$, output $V(t)\cos[\omega_c t + \varphi_0(t)]$.

Using the complex notation, the input and output of the transmission
medium can be written, respectively, as

$$v_i(t) = \exp[i\omega_c t + i\varphi(t)] \tag{3}$$

and

$$v_o(t) = V(t)\exp[i\omega_c t + i\varphi_0(t)], \tag{4}$$

where $V(t)$ is taken to be positive and $\varphi_0(t)$ is determined to
within $2n\pi$, where $n$ is an integer.

The output and input of the transmission medium are related by the
convolution integral

$$v_o(t) = \int_{-\infty}^{\infty} v_i(t - x)\,g(x)\,dx. \tag{5}$$

From (3), (4), and (5)

$$V(t)\exp[i\varphi_0(t)] = \int_{-\infty}^{\infty} \exp[i\varphi(t - x) - i\omega_c x]\,g(x)\,dx. \tag{6}$$

Let

$$V(t) = \exp[a(t)]. \tag{7}$$

Then

$$a(t) = \operatorname{Re}\,\ln\int_{-\infty}^{\infty} \exp[i\varphi(t - x) - i\omega_c x]\,g(x)\,dx \tag{8}$$

$$\varphi_0(t) = \operatorname{Im}\,\ln\int_{-\infty}^{\infty} \exp[i\varphi(t - x) - i\omega_c x]\,g(x)\,dx. \tag{9}$$

The AM distortion term expressed in dB is $20 \times 0.4343\,a(t)$ and the
PM distortion term expressed in radians (or degrees) is
$\varphi_0(t) - \varphi(t)$.

Assume that the transmission medium passes only frequencies in
the neighborhood of the carrier frequency, $\pm f_c \pm b$, with
$b/f_c \ll 1$. Thus, to a high degree of approximation, we have

$$\exp(-i\omega_c x)\,g(x) \approx \int_{-\infty}^{\infty} Y(\omega_c + \omega)\exp(i\omega x)\,df. \tag{10}$$

Let

$$k(x) = \frac{1}{Y(\omega_c)}\exp(-i\omega_c x)\,g(x). \tag{11}$$

Substituting (1) and (11) into (8) and (9) we obtain

$$a(t) = -\alpha(f_c) + \operatorname{Re}\,\ln\int_{-\infty}^{\infty} \exp[i\varphi(t - x)]\,k(x)\,dx \tag{12}$$

$$\varphi_0(t) = -\beta(f_c) + \operatorname{Im}\,\ln\int_{-\infty}^{\infty} \exp[i\varphi(t - x)]\,k(x)\,dx. \tag{13}$$

The quantity $k(x)$ may be regarded as the normalized envelope function
of the impulse response $g(x)$. We shall now discuss the logarithm
of the integral in (12) and (13) in detail.


III. DERIVATION OF DISTORTION TERMS


Let

$$\exp[i\varphi(t - x)] = M(t, x). \tag{14}$$

A delay $t_d$ is often introduced in order to improve the degree of
approximation of various series with the first few terms. Then $M(t,x)$
can be expanded about $x = t_d$ as

$$M(t,x) = M(t,t_d) + \frac{x - t_d}{1!}\left.\frac{\partial M(t,x)}{\partial x}\right|_{x=t_d} + \frac{(x - t_d)^2}{2!}\left.\frac{\partial^2 M(t,x)}{\partial x^2}\right|_{x=t_d} + \cdots. \tag{15}$$

One choice of $t_d$ is

$$t_d = \operatorname{Re}\int_{-\infty}^{\infty} x\,k(x)\,dx = \frac{1}{2\pi}\left.\frac{d\beta(f)}{df}\right|_{f=f_c}. \tag{16}$$

Using (14) and (15), the integral in (12) and (13) can thus be expressed
as

$$\int_{-\infty}^{\infty} \exp[i\varphi(t - x)]\,k(x)\,dx = \sum_{n=0}^{\infty} \frac{m_n}{n!}\left.\frac{\partial^n M(t,x)}{\partial x^n}\right|_{x=t_d}, \tag{17}$$

where $m_n$ is the nth moment of $k(x)$ defined as

$$m_n = \int_{-\infty}^{\infty} (x - t_d)^n\,k(x)\,dx, \qquad m_0 = 1, \tag{18}$$

or equivalently,

$$m_n = \frac{i^n}{Y(\omega_c)}\left.\frac{d^n}{d\omega^n}\Big[Y(\omega_c + \omega)\exp(i\omega t_d)\Big]\right|_{\omega=0}. \tag{19}$$

The series (17) is equivalent to the Carson-Fry series with delay $t_d$.
It may not converge in certain cases. However, when the characteristic
of a transmission medium can be truly represented by a polynomial,
from (19), the higher moments become zero and the series reduces to a
polynomial. The logarithm of (17) can be written as


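$$\ln\int_{-\infty}^{\infty} \exp[i\varphi(t - x)]\,k(x)\,dx = i\varphi(t - t_d) + \ln\left[1 + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!}\,m_n F_n(t - t_d)\right], \tag{20}$$

where

$$F_n(t - t_d) = (-1)^n \exp[-i\varphi(t - t_d)]\left.\frac{\partial^n}{\partial x^n}\exp[i\varphi(t - x)]\right|_{x=t_d}.$$

Using Taylor’s series expression, we have*

$$\begin{aligned}
\ln\left(1 + \sum_{n=1}^{\infty} a_n(t)\,z^n/n!\right) = {} & z a_1 + \frac{z^2}{2!}(a_2 - a_1^2) + \frac{z^3}{3!}(a_3 - 3a_1a_2 + 2a_1^3) \\
& + \frac{z^4}{4!}(a_4 - 4a_1a_3 - 3a_2^2 + 12a_1^2a_2 - 6a_1^4) + \cdots.
\end{aligned}$$

With $z = -1$ and $a_n(t) = m_n F_n(t - t_d)$, substituting the above series
into (20), after considerable algebra, one obtains an asymptotic series
which has been written by S. O. Rice in an unpublished work as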

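$$\begin{aligned}
\ln\int_{-\infty}^{\infty} \exp[i\varphi(t - x)]\,k(x)\,dx = {} & i\Big[\varphi - \frac{m_1}{1!}\varphi' + \frac{m_2}{2!}\varphi'' - \frac{m_3}{3!}\varphi''' + \frac{m_4}{4!}\varphi'''' - \cdots + \frac{\lambda_3}{6}\varphi'^3 \\
& \quad - \tfrac{1}{4}(m_4 - 2m_1m_3 - m_2^2 + 2m_1^2m_2)\,\varphi'^2\varphi'' + \cdots\Big] \\
& + \Big[-\tfrac{1}{2}\lambda_2\varphi'^2 + \tfrac{1}{2}(m_3 - m_1m_2)\,\varphi'\varphi'' - \tfrac{1}{6}(m_4 - m_1m_3)\,\varphi'\varphi''' \\
& \quad - \tfrac{1}{8}(m_4 - m_2^2)\,\varphi''^2 + \tfrac{1}{24}\lambda_4\varphi'^4 + \cdots\Big],
\end{aligned} \tag{21}$$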


* Notice that this is not the approximation $\ln(1 + y) \approx y$, which can be
a poor approximation and yet has been quite widely used in FM work.

where $\varphi$ stands for $\varphi(t - t_d)$ and the $\lambda_n$'s are the
semi-invariants, which are related to the moments by

$$\begin{aligned}
\lambda_2 &= m_2 - m_1^2 \\
\lambda_3 &= m_3 - 3m_1m_2 + 2m_1^3 \\
\lambda_4 &= m_4 - 4m_1m_3 - 3m_2^2 + 12m_1^2m_2 - 6m_1^4.
\end{aligned}$$
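
The relations between moments and semi-invariants translate directly into code. The sketch below is illustrative only: the function name is an assumption, and the moments may be real or complex numbers, as they are for $k(x)$.

```python
def semi_invariants(m1, m2, m3, m4):
    """Return (lambda2, lambda3, lambda4) from the moments m1..m4,
    using the relations quoted in the text. Works for complex moments."""
    lam2 = m2 - m1 ** 2
    lam3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    lam4 = (m4 - 4 * m1 * m3 - 3 * m2 ** 2
            + 12 * m1 ** 2 * m2 - 6 * m1 ** 4)
    return lam2, lam3, lam4


if __name__ == "__main__":
    # Moments n! of a unit exponential density, whose semi-invariants are (n-1)!.
    print(semi_invariants(1, 2, 6, 24))
    # Moments of a zero-mean, unit-variance Gaussian density.
    print(semi_invariants(0, 1, 0, 3))
```

The two sample calls use densities whose semi-invariants are known in closed form, which gives a quick check on the signs and coefficients.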


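Taking the real and imaginary parts of (21) and substituting the result
into (12) and (13) gives, respectively,

$$\begin{aligned}
a(t) = {} & -\alpha(f_c) + m_{1i}\varphi' - \frac{m_{2i}}{2!}\varphi'' + \frac{m_{3i}}{3!}\varphi''' - \frac{m_{4i}}{4!}\varphi'''' + \cdots \\
& - \frac{\lambda_{2r}}{2}\varphi'^2 + \frac{l_{2r}}{2}\varphi'\varphi'' - \frac{l_{3r}}{6}\varphi'\varphi''' - \frac{l_{4r}}{8}\varphi''^2 \\
& - \frac{\lambda_{3i}}{6}\varphi'^3 + \frac{l_{1i}}{4}\varphi'^2\varphi'' + \frac{\lambda_{4r}}{24}\varphi'^4 + \cdots
\end{aligned} \tag{22}$$

$$\begin{aligned}
\varphi_0(t) = {} & -\beta(f_c) + \varphi - m_{1r}\varphi' + \frac{m_{2r}}{2!}\varphi'' - \frac{m_{3r}}{3!}\varphi''' + \frac{m_{4r}}{4!}\varphi'''' - \cdots \\
& - \frac{\lambda_{2i}}{2}\varphi'^2 + \frac{l_{2i}}{2}\varphi'\varphi'' - \frac{l_{3i}}{6}\varphi'\varphi''' - \frac{l_{4i}}{8}\varphi''^2 \\
& + \frac{\lambda_{3r}}{6}\varphi'^3 - \frac{l_{1r}}{4}\varphi'^2\varphi'' + \frac{\lambda_{4i}}{24}\varphi'^4 + \cdots,
\end{aligned} \tag{23}$$

where the subscripts $r$ and $i$ denote the real and imaginary parts of the
corresponding coefficients and

$$\begin{aligned}
l_1 &= m_4 - 2m_1m_3 - m_2^2 + 2m_1^2m_2, \\
l_2 &= m_3 - m_1m_2, \\
l_3 &= m_4 - m_1m_3, \\
l_4 &= m_4 - m_2^2.
\end{aligned}$$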
From (22) and (23), amplitude and phase distortions are divided into
linear, second- and third-order terms and are shown in Table I. The
terms $-\alpha(f_c)$ and $-\beta(f_c)$ in (22) and (23) do not appear in
Table I. This is due to the fact that they are constants and introduce
only constant amounts of amplitude and phase distortions.


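TABLE I - AMPLITUDE AND PHASE DISTORTIONS DUE TO
IMPERFECT TRANSMISSION MEDIUM

Order of Distortion    Amplitude Distortion $a(t)$

Linear                 $m_{1i}\varphi' - \frac{m_{2i}}{2!}\varphi'' + \frac{m_{3i}}{3!}\varphi''' - \frac{m_{4i}}{4!}\varphi'''' + \cdots$
Second-order           $-\frac{\lambda_{2r}}{2}\varphi'^2 + \frac{l_{2r}}{2}\varphi'\varphi'' - \frac{l_{3r}}{6}\varphi'\varphi''' - \frac{l_{4r}}{8}\varphi''^2 + \cdots$
Third-order            $-\frac{\lambda_{3i}}{6}\varphi'^3 + \frac{l_{1i}}{4}\varphi'^2\varphi'' + \cdots$

Order of Distortion    Phase Distortion $\varphi_0(t) - \varphi(t)$

Linear                 $-m_{1r}\varphi' + \frac{m_{2r}}{2!}\varphi'' - \frac{m_{3r}}{3!}\varphi''' + \frac{m_{4r}}{4!}\varphi'''' - \cdots$
Second-order           $-\frac{\lambda_{2i}}{2}\varphi'^2 + \frac{l_{2i}}{2}\varphi'\varphi'' - \frac{l_{3i}}{6}\varphi'\varphi''' - \frac{l_{4i}}{8}\varphi''^2 + \cdots$
Third-order            $\frac{\lambda_{3r}}{6}\varphi'^3 - \frac{l_{1r}}{4}\varphi'^2\varphi'' + \cdots$


IV. PRE-EMPHASIS CHARACTERISTIC


The multichannel baseband signal of an FM system is represented
by a band of random noise. Assuming that the bottom baseband frequency
is much smaller than the top baseband frequency, the power-density
spectrum of the baseband FM signal is expressed as

$$S_{\varphi_i'}(\omega) = P_0, \qquad |f| \le f_b,$$

where $f_b$ is the top baseband frequency.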

A pre-emphasis network is used in an FM system in order to optimize
the noise across the baseband. Let $Z(\omega)$ be the transfer function of
the pre-emphasis network; we write

$$|Z(\omega)|^2 = a_0 + a_2 f^2 + a_4 f^4 + a_6 f^6, \qquad |f| \le f_b,$$

where the $a$'s are real constants either given a priori or determined by
least squares fitting from an actual curve. The power-density spectrum of
the pre-emphasized baseband FM signal is

$$S_{\varphi'}(\omega) = |Z(\omega)|^2\,S_{\varphi_i'}(\omega) = P_0(a_0 + a_2 f^2 + a_4 f^4 + a_6 f^6), \qquad |f| \le f_b. \tag{24}$$

In the case when the pre-emphasis coefficients $a_0$, $a_2$, $a_4$, and $a_6$
are determined by least squares fitting from an actual curve, the
following weighting function of normalized scale has been found useful
for better approximations near the bottom channels
W(f/fo) = [QOL fy—-


where Z is the difference of relative power (in dB) of top and bottom 
channels of the given pre-emphasis characteristic. 

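The rms frequency deviation, $\sigma$, due to noise loading can be
expressed as

$$(2\pi\sigma)^2 = \operatorname{ave}[\varphi'^2(t)] = \int_{-f_b}^{f_b} S_{\varphi'}(\omega)\,df.$$

Using (24), the relation between $P_0$ and $\sigma$ can be expressed as

$$P_0 = \frac{(2\pi\sigma)^2}{2f_b\left[a_0 + \dfrac{a_2 f_b^2}{3} + \dfrac{a_4 f_b^4}{5} + \dfrac{a_6 f_b^6}{7}\right]} \quad (\text{rad/sec})^2/\text{Hz},$$

where the units of $\sigma$ and $f_b$ are Hz.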
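
The relation between $P_0$ and $\sigma$ can be exercised numerically. The sketch below assumes a two-sided spectrum on $|f| \le f_b$, so that the mean-square deviation is the integral of (24) from $-f_b$ to $f_b$; that reading, the function names, and the sample coefficients are assumptions made for illustration.

```python
import math


def pre_emphasis_power(f, coeffs):
    """|Z|^2 = a0 + a2 f^2 + a4 f^4 + a6 f^6 for coeffs = (a0, a2, a4, a6)."""
    a0, a2, a4, a6 = coeffs
    return a0 + a2 * f ** 2 + a4 * f ** 4 + a6 * f ** 6


def baseband_density(sigma, f_top, coeffs):
    """P0 from the rms deviation sigma and top baseband frequency f_top.
    Assumes the two-sided integral of (24) over -f_top <= f <= f_top."""
    a0, a2, a4, a6 = coeffs
    bracket = (a0 + a2 * f_top ** 2 / 3 + a4 * f_top ** 4 / 5
               + a6 * f_top ** 6 / 7)
    return (2 * math.pi * sigma) ** 2 / (2 * f_top * bracket)


def mean_square_deviation(p0, f_top, coeffs, steps=20000):
    """Midpoint-rule integral of P0 |Z|^2 over -f_top <= f <= f_top."""
    h = 2 * f_top / steps
    total = 0.0
    for k in range(steps):
        total += pre_emphasis_power(-f_top + (k + 0.5) * h, coeffs)
    return p0 * total * h


if __name__ == "__main__":
    coeffs = (1.0, 0.5, 0.1, 0.01)  # assumed sample coefficients
    p0 = baseband_density(0.7, 2.0, coeffs)
    print(p0, mean_square_deviation(p0, 2.0, coeffs), (2 * math.pi * 0.7) ** 2)
```

Integrating the resulting spectrum recovers $(2\pi\sigma)^2$, which is the consistency condition that fixes $P_0$.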





V. TRANSMISSION MEDIUM 


Within the band of interest, $f_c \pm b$, let the gain and phase of the
transmission medium be, respectively,

$$\exp[-\alpha(f + f_c)] = 1 + g_1\omega + g_2\omega^2 + g_3\omega^3 + g_4\omega^4 + \sum_{k=1}^{N} u_k\cos(p_k\omega + \theta_k) \tag{25}$$

$$-\beta(f + f_c) = b_2\omega^2 + b_3\omega^3 + b_4\omega^4 + \sum_{k=1}^{N} v_k\sin(q_k\omega + \sigma_k), \tag{26}$$

where the $g$'s and $b$'s represent coarse shape transmission deviation;
the $u$'s and $v$'s represent fine shape transmission deviation. It should
be emphasized that the fine shape transmission deviation is restricted to
be a slowly varying ripple characteristic, hence this analysis does not
apply to the noise due to a single echo in general. The fine shape
representation is useful when we study the effect on the noise of a given
system due to the shift of the carrier. Since a constant delay does not
introduce intermodulation noise, the linear term in $\omega$ is not
included in (26). The transmission medium coefficients $g$, $b$, $u$, $v$,
$p$, $q$, $\theta$, and $\sigma$ in (25) and (26) are either given directly
or obtained approximately by a curve-fitting computer program.
Substituting (26) into (16), we have

$$t_d = -\sum_{k=1}^{N} v_k q_k \cos\sigma_k.$$

From (19), we can evaluate various moments, hence, the coefficients
associated with the distortion terms in Table I in terms of transmission 
medium coefficients, as given in Appendix A. In the following sections 
we shall derive expressions for intermodulation noise calculation due to 
second- and third-order distortion terms. 


VI. NOISE POWER DUE TO SECOND-ORDER DISTORTION TERM 


Since 
dn 
file ot 2 + ow 


2 
- y” — 209" = 20"? 


the second-order PM distortion in Table I can be written approximately 
as 


d ad\ , 
v(t) = (-s + fle; di a yels: t) Y : + gyliye””, (27) 
where 
ls; = Als; — Sls: . 


The second-order PM distortion term of (27) can be represented by the 
block diagram shown in Fig. 2, where 


Hy(w) = (gels — fro) + iGhiw), 
Ae(w) = gala . 

The power-density spectrum of g(t) is” 

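$$S_{\varphi_2}(\omega) = H_1(-\omega)H_1(\omega)S_{\varphi'^2}(\omega) + H_1(-\omega)H_2(\omega)S_{\varphi'^2\varphi''^2}(\omega) + H_2(-\omega)H_1(\omega)S_{\varphi''^2\varphi'^2}(\omega) + H_2(-\omega)H_2(\omega)S_{\varphi''^2}(\omega), \tag{28}$$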





Fig. 2— Block diagram representation of the second-order PM distortion term. 
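where $S_{\varphi'^2}(\omega)$ and $S_{\varphi''^2}(\omega)$ are the power-density spectra of $\varphi'^2$ and
$\varphi''^2$, respectively, and $S_{\varphi'^2\varphi''^2}(\omega)$ [or $S_{\varphi''^2\varphi'^2}(\omega)$] is the cross-power-density
spectrum of $\varphi'^2$ and $\varphi''^2$ (or $\varphi''^2$ and $\varphi'^2$). These quantities can be
derived as

$$S_{\varphi'^2}(\omega) = \mathcal{F}[2R_{\varphi'}^2(\tau) + R_{\varphi'}^2(0)],$$
$$S_{\varphi'^2\varphi''^2}(\omega) = \mathcal{F}[2R_{\varphi'\varphi''}^2(\tau) + R_{\varphi'}(0)R_{\varphi''}(0)],$$
$$S_{\varphi''^2\varphi'^2}(\omega) = \mathcal{F}[2R_{\varphi''\varphi'}^2(\tau) + R_{\varphi'}(0)R_{\varphi''}(0)],$$
$$S_{\varphi''^2}(\omega) = \mathcal{F}[2R_{\varphi''}^2(\tau) + R_{\varphi''}^2(0)],$$

where $R_{\varphi'}(\tau)$ and $R_{\varphi''}(\tau)$ are the autocorrelation functions of $\varphi'$ and
$\varphi''$, respectively, $R_{\varphi'\varphi''}(\tau)$ [or $R_{\varphi''\varphi'}(\tau)$] is the cross-correlation
function of $\varphi'$ and $\varphi''$ (or $\varphi''$ and $\varphi'$), and $\mathcal{F}$ stands for "the Fourier transform
of".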


The Fourier transform of a constant function is a delta function at
zero frequency. In the situation of evaluating noise power in the
baseband, this quantity is not of interest. Also, it can be shown that

$$S_{\varphi'^2\varphi''^2}(\omega) = S_{\varphi''^2\varphi'^2}(\omega).$$

Thus, (28) can be simplified as

$$S_{\varphi_2}(\omega) = 2|H_1(\omega)|^2\,\mathcal{F}[R_{\varphi'}^2(\tau)] + 2|H_2(\omega)|^2\,\mathcal{F}[R_{\varphi''}^2(\tau)] + 2[H_1(-\omega)H_2(\omega) + H_2(-\omega)H_1(\omega)]\,\mathcal{F}[R_{\varphi'\varphi''}^2(\tau)]. \tag{29}$$

The intermodulation noise to signal power ratio due to the second-order
distortion term expressed in dB is, therefore,

$$N_2/S\,(\text{dB}) = 10\log[\omega^2 S_{\varphi_2}(\omega)/S_{\varphi'}(\omega)]. \tag{30}$$


VII. NOISE POWER DUE TO THIRD-ORDER DISTORTION TERM 
Since 


d sigs ” 
ae = ee 


the third-order PM distortion in Table I can be written approximately as 


d\ 1 
g3(t) — (2% is ral, =) Y * 


The above equation can be represented by the block diagram shown in
Fig. 3, where


Fig. 3— Block diagram representation of the third-order PM distortion term. 


A3(w) = (§Aar) + t(— Palin). 
The power-density spectrum of $\varphi_3(t)$ is

$$S_{\varphi_3}(\omega) = |H_3(\omega)|^2 S_{\varphi'^3}(\omega),$$

where $S_{\varphi'^3}(\omega)$ is the power-density spectrum of $\varphi'^3$, which can be derived
as

$$S_{\varphi'^3}(\omega) = 6\,\mathcal{F}[R_{\varphi'}^3(\tau)] + 9R_{\varphi'}^2(0)\,S_{\varphi'}(\omega).$$

In the above equation, the term $9R_{\varphi'}^2(0)S_{\varphi'}(\omega)$ is merely a scaled
power-density spectrum of the input baseband FM signal; hence, it does not
contribute to the intermodulation noise and can be neglected in the
computation.

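The intermodulation noise to signal power ratio due to the third-order
distortion term expressed in dB is, therefore,

$$N_3/S\,(\text{dB}) = 10\log[\omega^2 S_{\varphi_3}(\omega)/S_{\varphi'}(\omega)], \tag{31}$$

where

$$S_{\varphi_3}(\omega) = 6|H_3(\omega)|^2\,\mathcal{F}[R_{\varphi'}^3(\tau)]. \tag{32}$$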


In (29) and (32), the Fourier transforms of $R_{\varphi'}^2(\tau)$, $R_{\varphi''}^2(\tau)$, $R_{\varphi'\varphi''}^2(\tau)$,
and $R_{\varphi'}^3(\tau)$ may be obtained by taking the convolutions in the frequency
domain. However, this requires numerical integration. For given
pre-emphasis characteristics and $P_0$ (or $\sigma$), these Fourier transforms can be
expressed in algebraic forms as shown in Appendix B. Hence, no
numerical integration is necessary. A digital computer program has been
written to calculate, in dB, the intermodulation noise due to the
second- and third-order distortion terms. A typical problem
can be solved at a very low cost.


VIII. EXAMPLES 


Several examples are considered in this paper. Calculated results are 
compared with measured data when they are available. Expressions for 
noise calculation are derived for simple cases. For more complicated 
situations, the noise calculation is best carried out by using a digital 
computer. 
8.1 Example 1


In this example, we wish to demonstrate how the intermodulation 
noise across the baseband can be optimized using an appropriate pre- 
emphasis network. To simplify the calculation, we assume that the 
transmission characteristic consists of linear delay distortion (b2) only. 
From Appendix A, we obtain 


Nei = — 2b. ) lie = —8b.2 
and all the other coefficients are equal to zero. Hence, 
Ay(w) = be ) H2(w) = OQ, H3(w) = 12.2. 


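In practical cases, the third-order distortion term [$H_3(\omega)$] is negligible
(say, 50 dB less) compared with the second-order distortion term. From
(30) we write

$$\frac{N_2}{S}\,(\text{dB}) = 10\log\frac{2b_2^2\,\omega^2\,\mathcal{F}[R_{\varphi'}^2(\tau)]}{S_{\varphi'}(\omega)}.$$

For simplicity, we let

$$S_{\varphi'}(\omega) = P_0(1 + a_2 f^2), \qquad |f| \le f_b.$$

From Appendix B, we obtain

$$\mathcal{F}[R_{\varphi'}^2(\tau)] = P_0^2 f_b\,\eta(f),$$

where

$$\eta(f) = -\tfrac{1}{30}A_2^2\Omega^5 - \tfrac{2}{3}A_2\Omega^3 + \left(\tfrac{2}{3}A_2 + 2\right)A_2\Omega^2 - (1 + A_2)^2\,\Omega + \left(\tfrac{2}{5}A_2^2 + \tfrac{4}{3}A_2 + 2\right)$$

$$A_2 = a_2 f_b^2$$

$$\Omega = f/f_b.$$

Since

$$P_0 = \frac{(2\pi\sigma)^2}{2 f_b\left(1 + \dfrac{A_2}{3}\right)},$$

consequently,

$$\frac{N_2}{S}\,(\text{dB}) = 10\log\frac{(2\pi)^4 (b_2 f_b^2)^2 \left(\dfrac{\sigma}{f_b}\right)^2 \Omega^2\,\eta(f)}{\left(1 + \dfrac{A_2}{3}\right)(1 + A_2\Omega^2)}.$$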
Specifically, we let

$b_2 = 7.962 \times 10^{-17}$ (linear delay of 1 nanosec/MHz)

$f_b = 1$ MHz

$\sigma = 1$ MHz.

After several computer runs, the optimal choice of $a_2$ is 7. The
intermodulation noise with no pre-emphasis and with the optimal
pre-emphasis is plotted in Fig. 4. The noise has been reduced by more than 5 dB
and is evenly distributed across the baseband by using the optimal
pre-emphasis network.
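The closed form used in this example can be checked numerically, since $\mathcal{F}[R_{\varphi'}^2(\tau)]$ is the self-convolution of the spectrum $P_0(1 + a_2 f^2)$ on $|f| \le f_b$. The sketch below is illustrative and is not the author's program: the function names are invented here, the polynomial $\eta$ is written in the normalized variable $\Omega$ for $0 \le \Omega \le 1$, and the sample values $f_b = 1$ MHz and $A_2 = 0$ are assumptions chosen only to exercise the formula.

```python
import math

def eta_closed(omega, A2):
    """Polynomial eta(Omega) for 0 <= Omega <= 1, with A2 = a2*fb^2."""
    return (-(A2**2) * omega**5 / 30
            - 2 * A2 * omega**3 / 3
            + (2 * A2 / 3 + 2) * A2 * omega**2
            - (1 + A2) ** 2 * omega
            + (2 * A2**2 / 5 + 4 * A2 / 3 + 2))

def eta_numeric(omega, A2, steps=20000):
    """Self-convolution of (1 + A2*x^2) on |x| <= 1, midpoint rule."""
    lo, hi = omega - 1.0, 1.0          # overlap of the two supports
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += (1 + A2 * x * x) * (1 + A2 * (omega - x) ** 2)
    return total * h

def n2_over_s_db(f, fb, sigma, b2, A2):
    """Second-order noise-to-signal ratio in dB for linear delay distortion b2."""
    omega = f / fb
    num = ((2 * math.pi) ** 4 * (b2 * fb**2) ** 2 * (sigma / fb) ** 2
           * omega**2 * eta_closed(omega, A2))
    den = (1 + A2 / 3) * (1 + A2 * omega**2)
    return 10 * math.log10(num / den)

if __name__ == "__main__":
    print(eta_closed(0.5, 7.0), eta_numeric(0.5, 7.0))
    print(n2_over_s_db(1.0e6, 1.0e6, 1.0e6, 7.962e-17, 0.0))
```

Sweeping `A2` in `n2_over_s_db` over the baseband is one simple way to search for a pre-emphasis value that flattens the noise, in the spirit of the computer runs described above.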





[Figure 4 plot omitted; curves: no pre-emphasis and optimal pre-emphasis; horizontal axis: frequency, f, in megahertz.]


Fig. 4 — Intermodulation noise due to linear delay with and without pre- 
emphasis. 


8.2 Example 2


In this example, we use a typical radio system pre-emphasis
characteristic shown in Fig. 5. The top baseband frequency is $f_b = 5.772$ MHz.
Since the given pre-emphasis characteristic is expressed as relative
power in dB versus baseband frequency, it is first converted to ratio
versus normalized baseband frequency, $\Omega = f/f_b$.

The weighting function is 


W(a) = io” (Q-1) 


A least squares approximation program is used to obtain the
approximating polynomial

$$a_0 + a_2 f^2 + a_4 f^4 + a_6 f^6,$$

where

$a_0 = 0.99894166$
$a_2 = 11.944252/f_b^2$
$a_4 = -5.5771705/f_b^4$
$a_6 = 1.4396088/f_b^6$.
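As a quick sanity check on the fitted coefficients, the polynomial can be evaluated in the normalized variable $\Omega = f/f_b$ and converted back to dB. This is only an illustrative sketch; the function names are assumptions made here, and the coefficients are the normalized products $a_{2k} f_b^{2k}$ quoted above.

```python
import math

# Normalized coefficients a0, a2*fb^2, a4*fb^4, a6*fb^6 from the fit above.
COEFFS = (0.99894166, 11.944252, -5.5771705, 1.4396088)

def preemphasis_ratio(omega):
    """Relative power (ratio) at normalized frequency omega = f/fb."""
    return sum(c * omega ** (2 * i) for i, c in enumerate(COEFFS))

def preemphasis_db(omega):
    """Relative power in dB at normalized frequency omega."""
    return 10 * math.log10(preemphasis_ratio(omega))

if __name__ == "__main__":
    for omega in (0.0, 0.5, 1.0):
        print(omega, round(preemphasis_db(omega), 3))
```

The difference between the values at $\Omega = 1$ and $\Omega = 0$ gives the top-to-bottom channel difference implied by the fitted curve.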


Consider a transmission characteristic consisting of a linear, a parabolic,
and a slowly varying sinusoidal delay as shown in Fig. 6. The
expressions for the noise due to second- and third-order distortion are too
complicated to write down. However, by using a digital computer, the
results are plotted in Fig. 7 for $\sigma = 0.771$ MHz. Clearly, a better
pre-emphasis network should be used to optimize the noise across the
baseband for this particular transmission characteristic.


8.3 Example 3


As a final example, we consider a single-pole IF filter with

$$Y(\omega + \omega_c) = \frac{1}{1 + i\,\dfrac{f}{w}},$$

where $w$ is the 3-dB half-bandwidth of the filter. Using a least squares
approximation with appropriate weighting function, the magnitude and
phase of $Y(\omega + \omega_c)$ are expressed by





[Figure 5 plot omitted; vertical axis: relative power in decibels; horizontal axis: frequency, f, in megahertz.]


Fig. 5.—Pre-emphasis characteristic of a typical radio system. 
[Figure 6 plot omitted; vertical axis: delay in nanoseconds; horizontal axis: frequency, $f - f_c$, in megahertz. Parameters shown on the plot: $b_2 = 0.797 \times 10^{-16}$, $b_3 = 0.847 \times 10^{-24}$, $\sigma_1 = \pi/5$, $q_1 = 2 \times 10^{-7}$.]


Fig. 6—An arbitrary delay characteristic of a transmission medium. 














[Figure 7 plot omitted; vertical axis in decibels; horizontal axis ticks from 0 to 1.0.]


Fig. 7 — Intermodulation noise due to the delay distortion of Fig. 6. 
$$\exp[-\alpha(f + f_c)] \approx 1 + g_2\omega^2 + g_4\omega^4 = 1 + G_2(f/w)^2 + G_4(f/w)^4$$

$$-\beta(f + f_c) \approx b_1\omega + b_3\omega^3 = B_1(f/w) + B_3(f/w)^3,$$

where

$G_2 = g_2(2\pi w)^2 = -0.4209$, $G_4 = g_4(2\pi w)^4 = 0.08027$
$B_1 = b_1(2\pi w) = -0.9529$, $B_3 = b_3(2\pi w)^3 = 0.1294$.


Fig. 8 — Gain and phase characteristic of a single pole filter.

The actual and approximated transmission characteristics are plotted in
Fig. 8. Since a constant delay does not introduce intermodulation noise,
$b_1$ is not included for calculation. Using the expressions in Appendix A,
the coefficients associated with the second-order distortion term are
all zero.
Hence,

$$H_1(\omega) = H_2(\omega) = 0.$$


The noise contribution due to second-order distortion is, therefore, zero. 
The coefficients associated with the third-order distortion term are 


A3r = 6b; ) liz = 2494 — Age”. 


Assuming no pre-emphasis, that is, $a_2 = a_4 = a_6 = 0$, from Appendix
B, we have

$$\mathcal{F}[R_{\varphi'}^3(\tau)] = P_0^3 a_0^3 f_b^2\,(3 - \Omega^2), \qquad |\Omega| < 1,$$

where

$$\Omega = f/f_b.$$
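The factor $(3 - \Omega^2)$ is the threefold self-convolution of a flat spectrum of unit height on $|\Omega| \le 1$. A minimal numerical check is sketched below; it is illustrative only, and the helper names are assumptions introduced here. It uses the fact that the twofold convolution of the flat spectrum is the triangle $2 - |x|$ on $|x| \le 2$.

```python
def triangle(x):
    """Self-convolution of a unit-height box on [-1, 1]."""
    return max(0.0, 2.0 - abs(x))

def triple_box(omega, steps=4000):
    """Convolve the triangle with the box once more, using the midpoint rule."""
    lo = omega - 1.0
    h = 2.0 / steps
    return sum(triangle(lo + (k + 0.5) * h) for k in range(steps)) * h

if __name__ == "__main__":
    for omega in (0.0, 0.25, 0.5, 1.0):
        print(omega, triple_box(omega), 3 - omega**2)
```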
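Using the relation

$$P_0 = \frac{(2\pi\sigma)^2}{2 f_b a_0},$$

the intermodulation noise to signal power ratio due to the third-order
distortion term can be derived from (31) as

$$\frac{N_3}{S}\,(\text{dB}) = 10\log\left\{\frac{3}{2}\left(\frac{f}{w}\right)^2\left(\frac{\sigma}{w}\right)^4 (3 - \Omega^2)\left[B_3^2 + \left(\frac{f}{w}\right)^2 (2G_4 - G_2^2)^2\right]\right\}.$$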


For $f_b = 1$ MHz and $w = 1.25$ MHz, $N_3/S$ (dB) is calculated at $f = 0.084$,
0.36, and 1 MHz as a function of $(\sigma/f_b)$. The dotted lines in Fig. 9
represent the calculated value ($S/N_3$) while the solid curves represent the
measured data taken by W. F. Bodtmann. The discrepancy between
the measured and calculated values can be attributed to several reasons:
(i) the power-density spectrum of the multichannel baseband signal
used in the experiment was not perfectly rectangular, (ii) during the
measurement, a non-ideal limiter was used which caused some AM-to-PM
conversion, (iii) the actual transmission characteristic was
approximated by few parameters in a limited region, and (iv) the formulas (30)
and (31) derived in this paper involve approximations. Nevertheless,
the measured and calculated results are close enough to show the utility
of the analysis even for cases beyond the application for which it was
originally intended.

Fig. 9 — Measured and calculated intermodulation noise due to the single
pole filter.


IX. DISTORTION DUE TO AM-TO-PM CONVERSION 


Let $K$ be the AM-PM conversion factor of a device expressed in
degrees/dB; then the phase distortion due to AM-PM conversion is

$$\text{Phase Distortion due to AM-PM Conversion} = 20 \times 0.4343\,K\,a(t) = 8.686\,K\,a(t) \ \text{degrees},$$

where $a(t)$ is given by (22).
Similarly,

$$\text{Frequency Distortion due to AM-PM Conversion} = \frac{8.686}{2\pi}\,K\,\frac{d}{dt}a(t) \approx 1.382\,K\,\frac{d}{dt}a(t) \ \text{Hz}.$$


In Table I, the second- and third-order terms of amplitude distortion 
cause intermodulation noise while the linear term causes video roll-off 
or enhancement. Using the same approach discussed previously in this 
paper, noise calculation due to AM-to-PM conversion which is caused 
by transmission deviation can also be accomplished. 
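The two numerical constants above come from $20\log_{10}e$ and from dividing that value by $2\pi$. The short sketch below reproduces them; it assumes, as the factor $20 \times 0.4343$ suggests, that $a(t)$ is expressed in nepers, and the function names are hypothetical helpers introduced here.

```python
import math

DB_PER_NEPER = 20 * math.log10(math.e)   # approximately 8.686

def am_pm_phase(K, a):
    """Phase distortion for conversion factor K (degrees/dB) and amplitude term a."""
    return DB_PER_NEPER * K * a

def am_pm_frequency(K, da_dt):
    """Frequency distortion using the constant 8.686/(2*pi), approximately 1.382."""
    return DB_PER_NEPER / (2 * math.pi) * K * da_dt

if __name__ == "__main__":
    print(round(DB_PER_NEPER, 3), round(DB_PER_NEPER / (2 * math.pi), 3))
```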
APPENDIX A


Coefficients Associated with the Distortion Terms 
The moments defined in (19) can be expressed as 


Mm = 1 


M2 = a Ar) + ¢ (—B”) 
m= (or ES (A 
C 


3B” 4. A) 4 i (2 Bl" 4. a i 545") 





A 
where

$$A = 1 + \sum_{k=1}^{N} u_k \cos\theta_k$$

$$A' = g_1 - \sum_{k=1}^{N} u_k p_k \sin\theta_k$$

$$A'' = 2g_2 - \sum_{k=1}^{N} u_k p_k^2 \cos\theta_k$$

$$A''' = 6g_3 + \sum_{k=1}^{N} u_k p_k^3 \sin\theta_k$$

$$A'''' = 24g_4 + \sum_{k=1}^{N} u_k p_k^4 \cos\theta_k$$

$$B'' = 2b_2 - \sum_{k=1}^{N} v_k q_k^2 \sin\sigma_k$$

$$B''' = 6b_3 - \sum_{k=1}^{N} v_k q_k^3 \cos\sigma_k$$

$$B'''' = 24b_4 + \sum_{k=1}^{N} v_k q_k^4 \sin\sigma_k.$$

The coefficients associated with the second-order distortion are 


Mu 





Aw = —B 
At AA” 

Br ae eae 

mr . 3A'B'  6A"B” = 3A”B” 
SE ee a gs 

nun, 124”B" A”B’”’ 
la; = B + A =— 12 7 Ce « 

The coefficients associated with the third-order distortion are 
Xs _ BR 
es yh AAS AltA” 
9, 
A ag. ag aS 
APPENDIX B 


Fourier Transforms of $R_{\varphi'}^2(\tau)$, $R_{\varphi'\varphi''}^2(\tau)$, $R_{\varphi''}^2(\tau)$ and $R_{\varphi'}^3(\tau)$


The Fourier transforms of $R_{\varphi'}^2(\tau)$, $R_{\varphi'\varphi''}^2(\tau)$, $R_{\varphi''}^2(\tau)$ and $R_{\varphi'}^3(\tau)$ can
be derived in a straightforward manner. However, a considerable amount
of algebra is involved in the derivation. Without writing out
the details here, we merely present the final results.


S[Ry?(r)] = Pofe{ Dif — fo) + 2Di(f) + Dif + 2fe) 
— Df — 2fo) + 2D2(f) — Dif + 2fe) 
+ 2D3(f — 2fo) — 2Da(f + 2fs)} 

S[Ry(7)] = Pofe? {3H — for) + EA — 3f) + Fi(f + 3h) 
+ 3Ei(f + foe) + 3Ei(f — foe) —8Huf — 3fo) 
— 3EA(f + 3fo) + 8B + fo) + 38E:(f — fr) 
+ 3E3(f — 3fv)


— 3437f + 3fo) — 323(f + fo) + 3£i(f — fo) 
— Eo(f — 3fc) + Eal(f + 3fo) — 3ha(f + fr)} 
F[ Ree (r)] = 4a? Porf?{ Ii(f — fo) + 2i(f) + Jif + 2fs) — Jo(f— 2fs) 
+ 2d2(f) — Jaf + 2fs) — 2da(f — 2fo) + 2a(f + 2fr)} 
F[Ry(r)] = 16rt Poefe{Kilf — 2fs) + 2Ki(f) + Ki(f + 2fr) 
— K(f — 2fs) + 2K2(f) 











— Ko(f + 2f-) + 2K3(f — 2fr) 





where 
Dif) = > Nea 1)! 
Dif) = ya (05a, = 1) 
Df) = > Ss cay i 
Bf) = a ) 5m Ty 
Bf) =X (-v ge (2 
Bilf) = ya (<D" oe (j, 
BA(f) = 33 ee yam in (i 
Af) = = (“I 5a (, 
( 


J2(f) = Dd (-0)" 55 1)! 


= z — n+l Jan 
J3(f) = 2. Coo ae 2(2n)! \f; 
k 





SIS, 


Soe 


aS) = Le (-D" saa 


—— 
is 5 


— 


MS, 


| 


Na” 


= 


te 
3 


|, 


"ew 


= 
fet 2(2n — 1)1\fp


n+l Kn ‘a a 
= 2, (-"" sony (f) end 


The coefficients in the above equations are given as 


K(f) = 2 a (f)" sgn f 


| 
Me 


K3(f) 


dy = or diy = 2¢1C2 , dys = C2 + ere; 

dis = 2C2€3 , dig = C3" 

dy = Ce, dy = 2C4Cs , dog = Cs” + 2c4C5 

dos = 2cscz + cree ) dos = Ce + 2esc7 

dog = 2ceC7, dor = C7 

dz, = CyC4, 32 = CiCs + Cola, das = C1Cg + C23 + C34 
34 = CiC7 + Cog + C3C5, 35 = CoC7 + Cae, 36 = CaCy 

éy = CP, C14 = 3e%C2, Cis = 3C1C2” + 3¢1°Cs 

€16 = GciCocs + C23, e17 = 3¢1€3" + 3e27C3, Cig = 3C2C;” 

C19 = C3°, Cor = Ca, Can = 3E47C5 

C23 = 3CsCs” + ScaCg, C24 = Se4C7 + OcaCsCs + 5° 

€25 = 34? + GesCscr + 3es"eg, C26 = GeaceCr + 3C5¢6? + 3¢5°C7 
C27 = 8c4C7" + Gescecr + Ce°, 2g = 3CsC7” + 3c67C7 

€99 = 3C6C7’, €210 = C7’, €32 = C1°C4 

C33 = 2cyCoCy + C1Cx, C34 = C2°C4 + QeyCsCq + QWiCoCs + C1°Ce 
€35 = 2c2C3Cq + C2°C3 + 2er03C3 + 2crC2e—g + C1°C; 

€36 = €3°Cy + 2C2C3C5 + C2°Cg + 21036 + 21027 

€37 = C3°Cg + 2C2C3C—e + C2°C7 + 2C1€3C7 

€3g = C3°Cg + 2CoC3C7, 39 = C3°C7, Cag = 10g? 

€43 = 2exCsCgs + C2C4?, Cay = C1C5” - 2eyCacg + Wotgcs + C3C2 
€43 = 2C1C4C7 + 2c yC5C— + C25? + 2c3Cacs + 2CoC4Co 

es = Cre” + ZexesCr + 2C2Calz + 2WC2esCg + CsCs” + 2CsCsCe 


C47 = 2CiC6C7 + CoC—” + 2CoCsCz + ZWCsCac7 +- Zc3Csce 


Js 
Js6 
kip 
Kg 
his 
kg 
kay 
Koy 
kos 


Kos 
c1€7? + 2ceCeC7 + C36? + 2C3C5C7


Cc?” + 2C3C6C7 , C410 = C367 


= cf, jie = 2cs(c3 — 2c) 


= (Cs — 2¢1)? + 2cs(cg — 42) 


2es(cz — 6e3) + 2(e5 — 2c1) (cs — 4¢2) 

(cs — 42)? + 2(cs — 2c1) (c7 — Ges) 

2(cg — 4c2) (cr — 6¢3), jir = (Cr — 6c3)? 

(cx + c4)?, Jos = 2(e1 + €4) (Co + 86s) 

(c2 + 365)? + 2(ce1 + c4) (cg + 5c¢) 

14¢;(c1 + cs) + 2(e2 + 8e5) (c3 + 5c¢) 

(cz + 5es)? + 14er(co + 3¢5), jor = 14e7(e3 + 55) 

497, Js1 = ca(cr + 1) 

Ca(Co + 8¢5) + (C1 + 4) (C5 — 2¢1) 

ca(Cz + 56) + (c2 + 30s) (C5 — 2c1) + (c1 + 1) (Cg — 42) 


= TC4C7 ote (5 = 2¢1) (C3 + 566) + (C6 = 4cp) (C2 + 365) 


+ (cr — Ges) (1 + ea) 
7¢7(¢s — 21) + (cs + 5c6) (ce — 4¢2) + (C2 + 865) (er — Ges) 
7¢7(Ce — 4¢€2) + (c3 + 5c) (er — 6e3), Jar = Ter(cr — Ges) 
(ec, + 2c4)?, kis = —2(e1 + 2c4) (601 — C2 — 65) 


= (6c, — C2 — 665)? — 2(e, + 2c4) (20€2 — c3 — 10c¢) 


2(6c1 — C2 — 6¢s5) (20c2 — cz; — 10¢6) — 2(e1 + 2c4) (42c3 — 147) 
(20c2. — cz — 10c4)? + 2(6¢, — C2 — 6e5) (42c3 — 14c7) 

2(20c2 — c3 — 10cs) (42c3 — 14cz), has = (42c3 — 14c7)? 

cr, ko = —2es(4e1 — 3 -+ 2c1) 

(4c, — 5 + 2c1)? — 2es(8e2 + 12c5 — c¢) 

2(4e1 — cs + 2c4) (8c2 + 12¢5 — cg) — 2es(12c3 + 380c5. — cz) 


kos = (8¢2 + 12c5 — cg)? — 112cqc7 + 2(4e1 — c3 + 2c4) (12c3 + 3066 — €7) 
Iog = 112c;(4c1 — cs + 2c4) + 2(8c2 + 12c3 — cg) (12c3 + 30c6 — c7)
Koz = (12c3 + 30c— — c7)? + 112c7(8ce2 + 12¢3; — ce) 
Iog = 112c;(12c3 + 380¢5 — C7), ko9 = (56c7)? 
ksy = ca(Cy + 2c4) 
ksx = —(e1 + 2c4) (4c, — €3 + 24) — cr(6c1 — C2 — 66s) 
kag = —(er + 2c4) (Sco + 12cs — cs) + (61 — Co — Ges) (4¢1 — Cs + 2¢4) 
— c4(20c2 — cz — 10¢e¢) 
koa = —(er + 2c4) (12€3 + 30c5 — c7) + (6c1 — C2 — Ges) (82 + 12c5 — ce) 
+ (20c. — cz — 10c5) (4e1 — cs + 2c) — c4(42c3 — 147) 
kgs = —56er(c, + 2c4) + (6c1 — ce — 6¢5) (12c3 + 30¢6 — C7) 
+ (20c2 — cz — 10c¢) (Sco + 12c3 — Cg) 
+ (42c3; — 14¢7) (4e1 — ¢s + 2c4) 
kze = 56c7(6c, — co — 6c5) + (20c2 — cz — 10c¢6) (12c3 + 30¢6 — cz) 
+ (42c3 — 147) (8cz + 12c5 — Cs) 
kg7 = 56c7(20c2 — cz — 10cg) + (42c3 — 14c7) (12c3; + 30c5 — cz) 
ksg = 56c7(42c3 — 14c7), 


where

$c_1 = 2(a_2 f_b^2 + 2a_4 f_b^4 + 3a_6 f_b^6)$

$c_2 = -24(a_4 f_b^4 + 5a_6 f_b^6)$, $c_3 = 720\,a_6 f_b^6$

$c_4 = a_0 + a_2 f_b^2 + a_4 f_b^4 + a_6 f_b^6$

$c_5 = -2(a_2 f_b^2 + 6a_4 f_b^4 + 15a_6 f_b^6)$

$c_6 = 24(a_4 f_b^4 + 15a_6 f_b^6)$, $c_7 = -720\,a_6 f_b^6$.
Bounds for Certain
Multiprocessing Anomalies 


By R. L. GRAHAM 
(Manuscript received July 11, 1966) 


It is known that in multiprocessing systems composed of many identical
processing units operating in parallel, certain timing anomalies may
occur; e.g., an increase in the number of processing units can cause an
increase in the total length of time needed to process a fixed set of tasks.
In this paper, precise bounds are derived for several anomalies of this type.


I. INTRODUCTION 


In recent years there has been increased interest in the study of the
potential advantages afforded by the use of a computer with many
processors in parallel. While it is generally true that a set of tasks may
be processed in less time by this type of multiprocessing, it has been
pointed out that certain anomalies$^{1,2}$ may occur, even though the
processors are used in a very "natural" way (e.g., it can happen that increasing
the number of processors can increase the time required to complete a
given set of tasks).

It is the purpose of this paper to derive precise bounds on the extent 
to which these anomalies can affect the time required to process a set 
of tasks, given certain rather natural rules for the operation of the 
multiprocessing system. 


1.1 Description of the System 


Let us suppose that we are given $n$ identical processing units $P_i$,
$1 \le i \le n$, and a set of tasks $T = \{T_1, \ldots, T_m\}$ to be processed by
the $P_i$. We are also given a partial-order* $<$ on $T$ and a function
$\mu: T \to [0, \infty)$. Once a processor $P_i$ begins a task $T_j$, it works without
interruption on $T_j$ until completion of that task, taking altogether
$\mu(T_j)$ units of time. It is also required that if $T_i < T_j$ then $T_j$ cannot
be started until $T_i$ is completed. The $P_i$ execute the $T_j$ in the following
way: We are given a linear ordering $L: (T_{k_1}, \ldots, T_{k_m})$ of $T$ called a
task list (or priority list). In general, at any time $t$ a $P_i$ completes a
task, it immediately (and instantaneously) scans the list $L$ (starting
from the beginning) until it comes to the first task $T_j$ which has not
yet begun to be executed. If all the predecessors of $T_j$ (i.e., those $T_i < T_j$)
have been completed by time $t$ then $P_i$ begins working on $T_j$. Otherwise
$P_i$ proceeds to the next task $T_{j'}$ in $L$ which has not yet begun to be
executed, etc. If $P_i$ proceeds through the entire list $L$ without finding
a task to execute then $P_i$ becomes idle (we shall also say that $P_i$ is
working on an empty task). $P_i$ remains idle until some other $P_k$
completes a task, at which time $P_i$ (and of course $P_k$) immediately scans the
list $L$ as before for possible tasks to execute. If two processors $P_i$ and
$P_j$, $i < j$, simultaneously attempt to begin the same task $T_k$, it will be
our convention to assign $T_k$ to $P_i$, the processor with the smaller index.
The processors all start scanning $L$ at time $t = 0$ and proceed in the
above-mentioned fashion until some time $\omega$, the least time for which
all the tasks have been completed.

* See Ref. 2.
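The scanning rule described above can be sketched as a small event-driven simulation. This is an illustrative sketch rather than code from the paper: the function name `list_schedule`, the dictionary-based representation of $\mu$ and of the predecessor sets, and the sample data are all assumptions made here. The sample run uses three independent tasks of lengths 1, 1 and 2 on two processors with two different lists.

```python
def list_schedule(n, mu, preds, order):
    """Finishing time omega for n processors scanning the fixed list `order`.

    mu:    dict mapping task -> execution time (>= 0)
    preds: dict mapping task -> set of tasks that must finish first
    order: the task list L, as a sequence of task names
    """
    busy_until = [0] * n       # time at which each processor is next free
    current = [None] * n       # task currently running on each processor
    started, done = set(), set()
    t = 0
    while len(done) < len(mu):
        # Retire every task that has finished by time t.
        for i in range(n):
            if current[i] is not None and busy_until[i] <= t:
                done.add(current[i])
                current[i] = None
        # Free processors scan L in index order (smaller index wins ties).
        for i in range(n):
            if current[i] is None:
                for task in order:
                    if task not in started and preds.get(task, set()) <= done:
                        started.add(task)
                        current[i] = task
                        busy_until[i] = t + mu[task]
                        break
        if len(done) == len(mu):
            break
        running = [busy_until[i] for i in range(n) if current[i] is not None]
        if not running:
            raise ValueError("no task can start; the order relation is cyclic")
        t = min(running)       # jump to the next completion time
    return t

if __name__ == "__main__":
    mu = {"T1": 1, "T2": 1, "T3": 2}
    print(list_schedule(2, mu, {}, ["T1", "T3", "T2"]))   # list L
    print(list_schedule(2, mu, {}, ["T1", "T2", "T3"]))   # list L'
```

Advancing time only to the next completion keeps the simulation exact for arbitrary real task lengths, since processors only ever act at completion instants.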

It will be helpful here to consider several examples. We shall indicate
the partial-order $<$ on $T$ and the function $\mu$ by a directed graph $G(<,\mu)$.
In $G(<,\mu)$, the vertices will correspond to the $T_i$ and a directed edge
from $T_i$ to $T_j$ will indicate that $T_i < T_j$. Each vertex of $G(<,\mu)$ will
actually be labelled with the symbol $T_j/\mu(T_j)$, the $\mu(T_j)$ indicating the
time necessary to execute $T_j$. The activity of each $P_i$ is conveniently
represented by a timing diagram $D$ (also known as a Gantt diagram;
see Ref. 1). $D$ will consist of $n$ horizontal half-lines (labelled by the $P_i$)
in which each line is subdivided into segments* and labelled according
to the state of the corresponding processor.


Example 1: n = 3, L: (173, 71, Te, Ts, To, Ts, Tr, Ts) 


T,/4 T2/3 

74/5 = 73/1 7/3 
G(<,n): 

T/2 T,/2 2/3 


Oo _—__—_ 


* We always consider the segments as being closed on the left and open on the 
right. 
The symbol $\varphi_i$ indicates a processor is idle (i.e., working on the empty
task $\varphi_i$) but not all the $T_j$ have been completed. The indexing of the $\varphi_i$
is arbitrary. Thus, for $D$ we have $\omega = 9$.


Example 2: n = 4, £: (11, Te, Ts, Ts, Ts) 
T2/5 


G(< jp): Ti/4 T3/1 T;/4 





T,/3 





Here, $\omega = 13$. Note that in this example, $\omega$ is independent of $L$. We
should also point out here that we are using the convention that
whenever any $T_j$ is completed, then all current empty tasks $\varphi_i$ are also
terminated. Processors still idle are then given "new" empty tasks to
complete (e.g., $P_4$ in Example 2).

Example 3: $n = 3$, $L$: $(T_1, T_2, T_3, T_4, T_5, T_6, T_7)$


Ti/1 . 

T./1 . . 75/1 
G(<,u): 

Oy ee . Te/1 

Ts/1 . . 77/3 





Ps 


Suppose we use a different list L’ given by L’: (71, Te, T7, Ts, 
Ts, Ts, T«). We then have 


Ti Ts , 7s 





Hence, by simply using a different list $L'$, we have shortened $\omega$ by nearly
a factor of two. The significance of this and similar examples will be
brought out in the next section.

We see that, in general, $\omega$ is a function of the task list $L$, the "time"
function $\mu$, the partial-order $<$, and the number of processors $n$ (in
addition to the rules under which the $P_i$ operate). In this note, we
investigate the factor by which $\omega$ can increase if we simultaneously:

(i) Change* the task list $L$;
(ii) Decrease the function $\mu$;
(iii) Relax the partial-order $<$;
(iv) Change the number of processors from $n$ to $n'$.

While it might first be expected that (ii), (iii), or (iv) (with $n' > n$)
would cause a decrease in $\omega$, easy counterexamples† show that this is not
always the case. In the next section we obtain an upper bound on the
factor by which $\omega$ can increase because of (i), (ii), (iii), and (iv) (cf.
Theorem, p. 1571). This bound is just the expression $1 + (n - 1)/n'$. We
also show that this bound is the best possible in the sense that it cannot
be replaced by any smaller function of $n$ and $n'$.


II. THE MAIN RESULTS 


We begin this section by considering a special case of the general
problem. We include this here in order to acquaint the reader with the
basic ideas which will be used later. Suppose we are given a set of tasks
$T = \{T_1, \ldots, T_m\}$ and a directed graph $G(<,\mu)$ giving a partial-order
$<$ and a time function $\mu$ on $T$. We execute these tasks twice, each time
using two identical processors $P_1$ and $P_2$. The first time the tasks are
executed we use a task list $L$ while the second time the tasks are
executed we use another task list $L'$. Suppose the corresponding finishing
times are $\omega$ and $\omega'$. The question we consider now is this: How much
can the ratio $\omega'/\omega$ vary? This is answered by the following

Proposition: $\dfrac{2}{3} \le \dfrac{\omega'}{\omega} \le \dfrac{3}{2}$.

Proof: By the symmetry of $\omega$ and $\omega'$ it suffices to show that $\omega'/\omega \le \frac{3}{2}$.
The basic idea we shall use is a simple one. Consider the timing diagram
$D$ obtained when the tasks are executed using the list $L$. We want to
show that there is a chain‡ of tasks $T_{c_1} < T_{c_2} < \cdots < T_{c_r}$ which
has the property that whenever a processor is idle (i.e., executing an
empty task $\varphi_i$) then the other processor is executing one of the $T_{c_k}$.

* By "change" we mean "possibly change", etc.

† As far as the author is aware, these facts were first pointed out by Richards.$^2$
‡ i.e., a linearly-ordered subset using the partial-order $<$.
To define the $T_{c_k}$ we proceed as follows. First, let $T_{j_1}$ be defined to
be the task which has the latest finishing time in $D$ (if there is more
than one such task then we choose the task which is executed by the
higher-indexed processor). Let $\varphi_{i_2}$ be the empty task which has the
latest finishing time of all those empty tasks which finish at a time not later
than the starting time of $T_{j_1}$. By the construction of $D$, there must be a
task $T_l$ which has the same finishing time as $\varphi_{i_2}$. Define $T_{j_2}$ to be $T_l$.
In general, suppose we have defined $T_{j_k}$ for some $k \ge 2$. To define $T_{j_{k+1}}$,
let $\varphi_{i_{k+1}}$ be the empty task which has the latest finishing time of all those
empty tasks $\varphi_i$ which finish at a time not later than the starting time
of $T_{j_k}$. (If there are no such $\varphi_i$ then we are done, i.e., $T_{j_{k+1}}$ is not
defined.) By hypothesis, there must be a task $T_l$ which has the same
finishing time as $\varphi_{i_{k+1}}$ and which has a starting time not later than the
starting time of $\varphi_{i_{k+1}}$. Define $T_{j_{k+1}}$ to be $T_l$. We continue this
algorithm for as long as possible, say, until we have defined $T_{j_1}, \ldots, T_{j_r}$.

We first note that since no processor works on one empty task $\varphi_i$
while the other processor works on more than one task, then at any time
a processor is executing an empty task $\varphi_i$, the other processor is executing
one of the $T_{j_k}$. We next claim that $T_{j_{k+1}} < T_{j_k}$ for $1 \le k < r$. Suppose
this is not the case. If $t$ denotes the time at which a processor $P_i$ started
executing $\varphi_{i_{k+1}}$ then by the hypothesis concerning the operation of the
processors, $P_i$ should not have been idle (i.e., working on $\varphi_{i_{k+1}}$) since
at least one task, namely $T_{j_k}$, was eligible to be executed at that time.
Thus, the timing diagram $D$ is not valid and we have a contradiction.
Hence, we must have $T_{j_{k+1}} < T_{j_k}$ for $1 \le k < r$. By defining $T_{c_k} =
T_{j_{r+1-k}}$ for $1 \le k \le r$, the first assertion is proved. It follows at once
that if we let $\mu(\varphi_i)$ denote the length of time a processor spends executing
$\varphi_i$, then

$$\sum_{\varphi_i \in D} \mu(\varphi_i) \le \sum_{k=1}^{r} \mu(T_{c_k}). \tag{1}$$

The proof of the proposition now follows directly. Let $T_{i_1} < T_{i_2} <
\cdots < T_{i_s}$ be chosen (by the assertion just established) so that

$$\sum_{\varphi_i' \in D'} \mu(\varphi_i') \le \sum_{k=1}^{s} \mu(T_{i_k}), \tag{2}$$

where the $\varphi_i'$ are taken from $D'$ (the timing diagram obtained when the
list $L'$ is used). Note that $\omega'$ can be written as:

$$\omega' = \frac{1}{2}\left(\sum_{T_k \in T} \mu(T_k) + \sum_{\varphi_i' \in D'} \mu(\varphi_i')\right). \tag{3}$$

From (2) and (3) we have

$$\omega' \le \frac{1}{2}\left(\sum_{T_k \in T} \mu(T_k) + \sum_{k=1}^{s} \mu(T_{i_k})\right). \tag{4}$$

Since the following inequalities hold:

$$\omega \ge \frac{1}{2}\sum_{T_k \in T} \mu(T_k) \tag{5}$$

$$\omega \ge \sum_{k=1}^{s} \mu(T_{i_k}) \tag{6}$$

(where (6) follows from the fact that $T_{i_1} < T_{i_2} < \cdots < T_{i_s}$), then we
have from (4), (5), and (6)

$$\omega' \le \frac{1}{2}(2\omega + \omega) = \frac{3}{2}\,\omega$$

and the proposition follows.
The following example shows that the upper bound of $\frac{3}{2}$ cannot be
replaced by any smaller value.


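Example 4: $n = 2$, $L$: $(T_1, T_3, T_2)$, $L'$: $(T_1, T_2, T_3)$

$G(<,\mu)$: $T_1/1$, $T_2/1$, $T_3/2$

$D$: $P_1$: $T_1$, $T_2$; $P_2$: $T_3$; $\omega = 2$

$D'$: $P_1$: $T_1$, $T_3$; $P_2$: $T_2$, $\varphi_1$; $\omega' = 3$

Therefore, $\omega'/\omega = \frac{3}{2}$ and the upper bound of the proposition is achieved.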
Before stating the main theorem we introduce some notation. Let
$T = \{T_1, \ldots, T_m\}$ be a set of tasks. Let $G = G(<,\mu)$ and $G' = G'(<',\mu')$
be two directed graphs for $T$ with the partial-orders $<$, $<'$ and the time
functions $\mu$, $\mu'$. We say that $G' \le G$ if:

(i) $\mu' \le \mu$, i.e., $\mu'(T_i) \le \mu(T_i)$ for all $T_i \in T$.

(ii) $<' \subseteq <$, i.e., $T_i <' T_j$ implies $T_i < T_j$ for all $T_i, T_j \in T$.

Finally, suppose we execute the tasks twice, one time using the graph
$G$, a task list $L$ and $n$ processors, the other time using the graph $G'$, a
task list $L'$ and $n'$ processors. Let $\omega$ and $\omega'$ denote the respective
finishing times. We then have the

Theorem: If $G' \le G$ then

$$\frac{\omega'}{\omega} \le 1 + \frac{n - 1}{n'}.$$




Proof: By a slight modification of the argument used in the proposition,
it follows that if $\varphi_i'$, $1 \le i \le v$, denote the empty tasks of $D'$ then there
exists a chain $T_{j_1} <' T_{j_2} <' \cdots <' T_{j_s}$ of tasks in $T$ with
the property that whenever a processor is idle then some other processor is
executing one of the $T_{j_k}$. From this we conclude

$$\sum_{\varphi_i' \in D'} \mu'(\varphi_i') \le (n' - 1)\sum_{k=1}^{s} \mu'(T_{j_k}). \tag{7}$$

As before we note that

$$\omega' = \frac{1}{n'}\left(\sum_{T_j \in T} \mu'(T_j) + \sum_{\varphi_i' \in D'} \mu'(\varphi_i')\right) \le \frac{1}{n'}\left(\sum_{T_j \in T} \mu'(T_j) + (n' - 1)\sum_{k=1}^{s} \mu'(T_{j_k})\right), \tag{8}$$

where the inequality follows by (7). Since

$$\omega \ge \frac{1}{n}\sum_{T_j \in T} \mu(T_j) \ge \frac{1}{n}\sum_{T_j \in T} \mu'(T_j) \tag{9}$$

and

$$\omega \ge \sum_{k=1}^{s} \mu(T_{j_k}) \ge \sum_{k=1}^{s} \mu'(T_{j_k}), \tag{10}$$

then by (8), (9), and (10) we conclude

$$\omega' \le \frac{1}{n'}\bigl(n\omega + (n' - 1)\omega\bigr).$$

Hence,

$$\frac{\omega'}{\omega} \le 1 + \frac{n - 1}{n'}$$

and the theorem is proved.
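The bound of the theorem is easy to tabulate exactly. The helper below is a minimal sketch using exact rational arithmetic; the function name `anomaly_bound` is an assumption introduced here for illustration.

```python
from fractions import Fraction

def anomaly_bound(n, n_prime):
    """Upper bound 1 + (n - 1)/n' on the ratio omega'/omega."""
    if n < 1 or n_prime < 1:
        raise ValueError("processor counts must be positive")
    return 1 + Fraction(n - 1, n_prime)

if __name__ == "__main__":
    for n, n_prime in [(2, 2), (3, 3), (4, 8), (8, 4)]:
        print(n, n_prime, anomaly_bound(n, n_prime))
```

Note that the bound stays above 1 even when $n' > n$, which is consistent with the remark that adding processors does not guarantee a shorter finishing time.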

To show that this bound is best possible, we give several examples,
which show that the bound can be attained (to within $\epsilon$) by varying
any one of the four parameters $L$, $\mu$, $<$, or $n$.

Example 5: $L$ is varied.

$n = n'$, $\mu = \mu'$, $< = <'$

$L = (T_1, T_2, \ldots, T_{n-1}, T_{2n-1}, T_n, T_{n+1}, \ldots, T_{2n-2})$
$L' = (T_1, T_n, \ldots, T_{2n-2}, T_2, T_3, \ldots, T_{n-1}, T_{2n-1})$

$G(<,\mu)$: $T_1/1$, $T_2/1$, $\ldots$, $T_{n-1}/1$; $T_n/(n-1)$, $T_{n+1}/(n-1)$, $\ldots$, $T_{2n-2}/(n-1)$; $T_{2n-1}/n$

[Timing diagrams $D$ and $D'$ omitted.]

Thus,

$$\frac{\omega'}{\omega} = \frac{2n - 1}{n} = 2 - \frac{1}{n},$$

which is the value of $1 + (n - 1)/n'$ when $n = n'$.
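When no task has a predecessor, the list rule reduces to handing each task, in list order, to the processor that becomes free earliest (ties going to the smaller index). The sketch below uses that simplification with a heap to run the two lists of this example; it assumes the tasks are independent and that the lists are as reconstructed above, and the function names are introduced here for illustration only.

```python
import heapq
from fractions import Fraction

def greedy_makespan(n, durations):
    """Finishing time when independent tasks are started in list order."""
    free = [(0, i) for i in range(n)]      # (time free, processor index)
    heapq.heapify(free)
    finish = 0
    for d in durations:
        t, i = heapq.heappop(free)         # earliest free, smallest index on ties
        heapq.heappush(free, (t + d, i))
        finish = max(finish, t + d)
    return finish

def example_lists(n):
    """Durations in the order of L and L' for n processors (n >= 2)."""
    units, longs = [1] * (n - 1), [n - 1] * (n - 1)
    L = units + [n] + longs
    L_prime = [1] + longs + [1] * (n - 2) + [n]
    return L, L_prime

if __name__ == "__main__":
    for n in range(2, 7):
        L, L_prime = example_lists(n)
        w, w_prime = greedy_makespan(n, L), greedy_makespan(n, L_prime)
        print(n, w, w_prime, Fraction(w_prime, w))
```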
Example 6: u is decreased. 
n=n, 2 ps <a! 
| ne aes W/ Lirey eres Ey) 


(Ti) u'(T) 








T, 2e € 
sis Ve € 
G: 
V4 2e € 
Ts 2e 2e 
Tes 1 1 
Tn42 1 1 
Tonqe 
T on 1 1 
T on43 (In G, 
T on4i n— 1 n— l : T; < T; < Tend 
forl Sin 
Ton42 n—1 n— 1 : <j S 2n.) 
T3n 
Tsn n— I n— | 


[Timing diagrams for Example 6 omitted.]

Thus, $\omega'/\omega$ is arbitrarily close to $2 - (1/n)$ for $\epsilon$ sufficiently small. We
should note the interesting fact that $\omega' = 2n - 1 + \epsilon$ for any list $L'$
which may be used.


Example 7: $<$ is relaxed.

Thus, $\omega'/\omega$ is arbitrarily close to $2 - (1/n)$ for $\epsilon$ sufficiently small.
Example 8: $n$ is varied.

Thus,

$$\frac{\omega'}{\omega} = \frac{n' + n - 1 + \epsilon}{n' + 2\epsilon},$$

which is arbitrarily close to $1 + (n - 1)/n'$ for $\epsilon$ sufficiently small.

Case 2: $n > n'$. The construction in this case is similar to that of Case 1
and will not be presented.

We should note that in Example 8 we took $L = L'$. If it is of some
consolation to a possibly battered intuition, it should be noted that if
$n \le n'$, $\mu' = \mu$, and $<' = <$ then for any $L$ which is chosen, it is
possible to choose a suitable $L'$ for which $\omega' \le \omega$.


III. CONCLUDING REMARKS 


It should be pointed out here that we have not considered models of 
the multiprocessor system in which the priority list L is “dynamically 
formed” (as opposed to the fixed lists we have used thus far). For 
example, one seemingly quite reasonable way of doing this is as follows: 
At any time a processor is free, it immediately begins to execute the 
“ready” task (i.e., one which has all its predecessors completed) which 
currently heads the longest chain of unexecuted tasks (including itself). 
Suppose by following this algorithm in choosing tasks, we have a 
finishing time of ω*. If we denote by ω₀ the least possible finishing time 
(minimized over all lists), then we would like to assert something about 
the ratio ω*/ω₀. It follows from what has been proved in this paper 
that ω*/ω₀ ≤ 2 − (1/n) (where n is the number of processors), and 
we would hope that, in fact, we could show ω*/ω₀ is considerably closer 
to 1 than this. Unfortunately, this is not possible since it can be shown 
that the best possible bound on this ratio is given by 

$$\frac{\omega^*}{\omega_0} \le 2 - \frac{2}{n+1}.$$

It is interesting to note, however, that in the case in which the partial 
order < on the tasks is empty, this bound can be improved* to 

$$\frac{\omega^*}{\omega_0} \le \frac{4}{3} - \frac{1}{3n},$$

which, again, is best possible. 

In conclusion, one might ask just how “typical” the examples are for 
which ω′/ω is close to the upper bound 2 − (1/n). While very little 
work has been done on this aspect, empirical results (using computer 
simulation (see Ref. 1)) indicate that examples in which ω′/ω ≥ 1.1 
are quite common. 
Phase and Amplitude Measurements of 
Coherent Optical Wavefronts 


By JOSEPH T. RUSCIO 
(Manuscript received March 24, 1966) 


A phase-locked laser loop has been used as an amplitude and phase meas- 
uring device for coherent optical wavefronts. A relative phase resolution on 
the order of one degree and an amplitude resolution accurate to one dB or 
better were obtained. The system and measuring techniques used are de- 
scribed, and the results obtained are illustrated by several examples. 


I. INTRODUCTION 


A laser phase-locked loop¹ consisting of two laser oscillators has been 
used to measure the relative phase and amplitude of the wavefront of a 
laser beam. A phase resolution on the order of one degree and an ampli- 
tude accuracy better than one dB have been obtained. This system was 
used to analyze the optical qualities of devices placed in the beam’s 
path by measuring their effect on the wavefront. The system and tech- 
niques used along with the results obtained are described and illustrated 
in this paper. 


II. DESCRIPTION OF THE MODIFIED PHASE-LOCKED LASER LOOP 


The phase-locked system is shown in Fig. 1. It consists of controlled 
and uncontrolled optical oscillators which are single-frequency 
helium-neon lasers operated at 6328 Å. Details of the oscillators’ 
characteristics are shown in Figs. 2 and 3. The beam waists and spot sizes 
are defined and calculated in Appendix A. The two lasers used initially 
are shown in Fig. 2; however, tube replacements required a different 
combination of mirrors to maintain a single transverse mode, so the final 
measurements were made with the lasers shown in Fig. 3. Results are 
identified with the lasers used. 

Prior to combining the two beams (Fig. 1) on the surface of the 
photomultiplier by means of a mirror and a beam splitter, the output beam 
of the controlled laser, which will be referred to as the reference beam, 
is collimated by a telescope. 

Fig. 1 — Phase-locked optical maser system. 


III. THEORY OF OPERATION 


The beam splitter in Fig. 1 provides two outputs: Port 1 to phase-lock 
the system and Port 2 for making the phase and amplitude measurements. 
The photomultipliers are square-law detectors. Thus, if the field 
at the photosensitive surface is 

$$E = E_c \cos \omega_c t + E_u \cos \omega_u t$$

where $E_c$ is the controlled oscillator field amplitude, $E_u$ the 
uncontrolled oscillator field amplitude, and $\omega_c$, $\omega_u$ the 
respective angular frequencies, then since $E_c$ and $E_u$ have 
the same polarization, the resulting photocurrent is proportional to 

$$i = \tfrac{1}{2}(E_c^2 + E_u^2) + E_c E_u \cos(\omega_c - \omega_u)t$$

which consists of a dc term $\tfrac{1}{2}(E_c^2 + E_u^2)$ plus the 
difference-frequency term $E_c E_u \cos(\omega_c - \omega_u)t$. 
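
The step from the field to the photocurrent can be written out as follows. This is a sketch of the standard square-law expansion, under the assumption that the detector cannot follow terms at optical frequencies ($2\omega_c$, $2\omega_u$, $\omega_c + \omega_u$), so that only the dc and difference-frequency terms survive:

```latex
\begin{aligned}
E^2 &= E_c^2\cos^2\omega_c t + E_u^2\cos^2\omega_u t
       + 2E_cE_u\cos\omega_c t\,\cos\omega_u t \\
    &= \tfrac12 E_c^2\,(1+\cos 2\omega_c t) + \tfrac12 E_u^2\,(1+\cos 2\omega_u t)
       + E_cE_u\left[\cos(\omega_c-\omega_u)t + \cos(\omega_c+\omega_u)t\right] \\
    &\longrightarrow \tfrac12\,(E_c^2+E_u^2) + E_cE_u\cos(\omega_c-\omega_u)t
       \quad\text{(optical-frequency terms averaged out).}
\end{aligned}
```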

In the original phase-lock loop,¹ the two lasers were locked at the same 
frequency with a dc error voltage proportional to their phase difference. 
When the loop locks, the controlled laser tracks the frequency of the 
uncontrolled laser in a manner such that the instantaneous phase error α 
remains smaller than 90° in absolute magnitude. A discussion of the phase 
relationships in the loop is given in Appendix B. 

To improve the measurement of phase and amplitude, the laser 
oscillators were phase-locked at a fixed frequency difference of 2 MHz by 
using an additional phase detector with a 2-MHz crystal-controlled 
oscillator as a reference. When the lasers are tuned so that their 
difference frequency is 2 MHz, the 2-MHz output from the photomultiplier 
is amplified and applied to the phase detector. The phase detector output 
is a dc error voltage proportional to the phase difference between the 
2-MHz beatnote and 2-MHz reference signal (Appendix B). This error voltage 
is fed back through a differential amplifier to a piezoelectric disc 
transducer. A mirror mounted on this transducer forms one end of the 
controlled laser cavity. There is an additional transducer-mounted mirror 
on the other end of the laser cavity; this is used for initial tuning (see 
Figs. 2(b) and 3(b)). The error voltage causes the cavity length and 
hence the frequency to change in a direction such that the phase error 
is decreased. When the loop locks, the controlled laser tracks the 
frequency of the uncontrolled laser in a manner such that the instantaneous 
phase error between the two 2-MHz signals remains less than 90°. The 
loop tracks over a frequency range of ±50 MHz, based on a feedback 
voltage of ±80 volts and a piezoelectric transducer having a sensitivity 
of 0.6 MHz/volt. This means that the phase difference between the 
reference signal and the beatnote signal remains less than 90° in absolute 
magnitude as long as the frequency of the uncontrolled laser does not 
vary more than ±50 MHz. 

Fig. 2 — Initial lasers; (a) uncontrolled laser oscillator (signal), (b) controlled 
laser oscillator (reference). 
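
The quoted tracking range follows from the feedback voltage swing and the transducer sensitivity. A minimal arithmetic sketch, assuming the range is simply the product of the two quoted figures (the function name is illustrative):

```python
def tracking_range_mhz(feedback_volts, sensitivity_mhz_per_volt):
    """One-sided tracking range: voltage swing times tuning sensitivity."""
    return feedback_volts * sensitivity_mhz_per_volt

# 80 V at 0.6 MHz/V gives 48 MHz, which the text rounds to 50 MHz.
print(tracking_range_mhz(80.0, 0.6))
```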


IV. TECHNIQUE OF MEASUREMENT 


Port 2 of the beam splitter provides an output which is utilized for 
phase and amplitude measurements; this permits scanning the combined 
beams without interfering with the phase-locked loop. A circular 
collection aperture of a few mils diameter is used to scan the 
superimposed wavefronts, selecting the “point” area detected by the 
photomultiplier. The phase of the 2-MHz beatnote obtained from the 
photomultiplier is dependent on the position of the “point” area on the 
wavefronts. Frequency selective circuits, including a 2-MHz tuned circuit 
in the photomultiplier output and a crystal filter (2-MHz center 
frequency, 4-kHz bandwidth), assist in maintaining a signal-to-noise 
ratio that is better than 40 dB. The result is a well-defined 2-MHz signal 
(Fig. 4(a)), which with the 2-MHz reference can be used to produce 
Lissajous patterns, as in Figs. 4(b) and 4(c), on an oscilloscope. By this 
means relative phase measurements between the two beams are possible. 
Distortion of the pattern in Fig. 4(b) is due to limitations in the 
horizontal amplifier of the oscilloscope. Measurement of the beatnote 
amplitude as a function of the probe position from the beam axis is used to 
determine the relative amplitude of the wavefronts. The techniques and 
theory used for both phase and amplitude measurements will be 
described. 

Fig. 4 — Beatnote (2 MHz) Lissajous; (a) 2-MHz beatnote, (b) Lissajous pattern 
indicating phase difference between two 2-MHz signals, (c) zero phase shift 
between two 2-MHz signals. 


V. PHASE MEASUREMENT 


The phase measurement is based on the fact that each of the two 
spherical wavefronts of radii $R_1$ and $R_2$ can be expressed approximately 
as 

$$E_1 = \exp(jkd^2/2R_1) = \exp(j\Phi_1),$$

where $\Phi_1 = kd^2/2R_1$, $d$ is the distance from the beam axis, and 
$k = 2\pi/\lambda$, and 

$$E_2 = \exp[j(kd^2/2R_2 + \gamma)] = \exp[j(\Phi_2 + \gamma)],$$

where $\Phi_2 = kd^2/2R_2$ and $\gamma$ is the phase difference between the 
two wavefronts on the axis. Thus, the phase difference between the two 
wavefronts is given by 

$$\Delta\Phi = \Phi_2 - \Phi_1 + \gamma = (kd^2/2)(1/R_2 - 1/R_1) + \gamma.$$

The telescope reduces the divergence of the reference beam so that its 
radius of curvature,* $R_1$, can be considered infinitely large; therefore, 
$\Phi_1 \approx 0$ and 

$$\Delta\Phi \approx (kd^2/2R_2) + \gamma.$$

From Fig. 5, it can be seen that if the phase shift as indicated by the 
Lissajous pattern is adjusted to be zero at the center of the beam (by means 
of an auxiliary phase shifter), all measurements can be made relative 
to this reference and the relative phase shift becomes 

$$\Delta\Phi \approx kd^2/2R_2.$$

* Calculations for the radii of curvature involved using the telescope are given 
in Appendix C. 

Fig. 5 — Phase relationship between spherical and planar wavefronts. 

Measurement of $\Delta\Phi$ as a function of distance $d$ from the beam axis 
provides a means of determining $R_2$. 
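
The relation between off-axis distance, phase shift and radius of curvature can be evaluated directly. The sketch below assumes the paraxial expression ΔΦ ≈ kd²/2R with the 0.6328-μm wavelength quoted in the paper; the function names and the sample numbers (a 5-mil offset and a 3-m radius) are illustrative only.

```python
import math

WAVELENGTH_M = 0.6328e-6  # He-Ne wavelength quoted in the paper

def phase_shift_deg(d_m, radius_m, wavelength_m=WAVELENGTH_M):
    """Relative phase shift (degrees) at distance d from the beam axis."""
    k = 2 * math.pi / wavelength_m
    return math.degrees(k * d_m ** 2 / (2 * radius_m))

def radius_from_phase(d_m, phase_deg, wavelength_m=WAVELENGTH_M):
    """Invert the same relation to estimate the radius of curvature."""
    k = 2 * math.pi / wavelength_m
    return k * d_m ** 2 / (2 * math.radians(phase_deg))

# Hypothetical example: a 5-mil (127-micron) offset on a wavefront with R = 3 m
shift = phase_shift_deg(127e-6, 3.0)
print(round(shift, 2), radius_from_phase(127e-6, shift))
```

For these sample values the shift is about 1.5 degrees, which is of the same order as the one-degree phase resolution reported for the system.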


The experimental layout for measuring the relative phase between the 
two beams is shown in Fig. 1. The movable collection aperture positioned 
directly in front of the photomultiplier can be moved in 5-mil increments 
along the horizontal or vertical axis. Changes in phase with position are 
plotted as shown in Figs. 6(a), (b) and 7(a), (b). 

Fig. 6 — Optical wavefronts (laser in Fig. 2(a)). 

Fig. 7 — Optical wavefronts (laser in Fig. 3(a)). 


The curves in Figs. 6(a) and (b) are for the laser shown in Fig. 2(a). 
Measurement of the optical wavefront was made at a distance of 3 meters 
from the apparent beam waist using a 0.015-inch diameter collection 
aperture. Similar data for the laser in Fig. 3(a) are shown in Figs. 7(a) 
and (b) in which case the collection aperture diameter was 0.009 inch. 
Theoretical curves indicate radii of curvature less than the measured 
values for all cases. The radius of curvature at a distance z from the 
apparent beam waist location is given by² 

$$R = z\left[1 + \left(\frac{z_0}{z}\right)^2\right]$$

where $z_0 = \pi w_0^2/\lambda$, $w_0$ being the beam-waist radius, and 
λ = 0.6328 μm. 

The disagreement between the theoretical and experimental values 
has not been resolved. This problem remains under consideration as work 
continues in this area. 


VI. FIELD AMPLITUDE MEASUREMENTS 


In a similar manner, observing the 2-MHz beatnote amplitude as a 
function of probe distance perpendicular to the beam axis provides a 
means of determining the field amplitude distribution of the combined 
beams. With the reference beam enlarged, in this case to 30 times its 
initial size, the amplitude of the reference beam over the distance 
scanned is relatively constant; therefore, the amplitude distribution 
measured is, in fact, the relative amplitude of the signal beam. The 
accuracy is governed by the variation in intensity of the reference beam 
over the area scanned as shown in Fig. 8. With a reference beam spot 
size of 1-inch diameter, scanning a distance of 0.2a (a = beam radius) 
from the beam axis introduces an error of 0.4 dB in the relative 
measurements. 
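
The size of this error can be estimated from the Gaussian profile of the reference beam. The sketch assumes a field amplitude of the form exp(−r²/a²), with a the 1/e amplitude radius; the function name is illustrative.

```python
import math

def reference_droop_db(r_over_a):
    """Drop of a Gaussian field amplitude exp(-(r/a)^2) relative to the
    on-axis value, expressed in dB."""
    return -20 * math.log10(math.exp(-r_over_a ** 2))

# At r = 0.2a the reference amplitude is down by roughly 0.35 dB,
# the same order as the 0.4 dB quoted in the text.
print(round(reference_droop_db(0.2), 3))
```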

Examples of the measured amplitude distribution as a function of 
distance from the center of the beam are shown in Figs. 9(a), (b) and 
10(a), (b). In Fig. 9(b), which applies to the laser shown in Fig. 2(a), 
the theoretical Gaussian curve (exp(−r²/a²), a being the beam radius 
where the field amplitude falls to 1/e, and r the distance from the beam 
axis) agrees quite closely with the measured values. Figs. 10(a) and 
(b) show amplitude distribution curves for the laser in Fig. 3(a); in 
these an unexplained lack of symmetry appears. 


VII. DETERMINATION OF PROPERTIES OF OPTICAL ELEMENTS 


In addition to measuring the signal laser’s optical wavefront, it was 
also possible to determine the effects of putting a lens in the signal beam. 
Results of this experiment are described. 

To facilitate measurement of lenses, the signal beam was also 
collimated so that now both beam wavefronts were planar. Under these 
conditions, placing a glass lens in the signal laser beam produced a 
phase-front at the collection aperture dependent on the focal length of 
the lens. The experimental arrangement is shown in Fig. 11 and the 
results of measurements on an 86.6-cm focal length lens are shown in 
Fig. 12. The measurements agree quite closely with the theoretical 
values. 

Fig. 8 — Intensity distribution of Gaussian curve. 

Fig. 9 — Optical wavefront—amplitude (laser in Fig. 2(a), 0.009-inch collection 
aperture). 


VIII. POSSIBLE IMPROVEMENTS 


In Fig. 11 it can be seen that the lens being tested is common to both 
the phase-lock loop branch and the phase and amplitude measuring 
system. To phase-lock the loop, the two beams must be made coincident 
at the photomultiplier. A fixed device, such as a glass lens, can be 
inserted in the system, the beams aligned and the loop locked. However, if 
the item under test introduces random variations which displace the beams 
relative to each other, the phase-lock loop is affected and meaningful 
measurements are not possible. To eliminate this problem, the setup 
shown in Fig. 13 is preferable. Under these conditions, the phase-locked 
loop is independent of the component under test and is, therefore, not 
affected by any instability introduced. This method may require lasers 
with greater output powers because of the additional beam splitters 
required. 

Fig. 10 — Optical wavefront—amplitude (laser in Fig. 3(a), 0.009-inch 
collection aperture). 

Fig. 11 — Experimental arrangement. 


The system as it stands is sensitive to acoustic noises and for accurate 
phase measurements the laser must be maintained within a vault. 
Enclosure in the vault permits phase-lock to be maintained for periods 
of 2 or 3 hours, with occasional tuning adjustments of the laser by means 
of the transducer-mounted mirror. Under these conditions the phase- 
lock is sufficiently stable to permit measurements without too much 
difficulty; however, it would be desirable to have portable lasers that 
could be used under less ideal conditions than a closed vault. Use of a 
transducer with a higher resonant frequency and additional gain in the 
feedback loop should increase the phase-lock stability. 
APPENDIX A 


Calculation of Beam Waists and Spot Sizes 


The following notations, some of which have already appeared, will 
apply in the following development: 

    w = spot radius, defined as the radius at which the field 
        amplitude falls to 1/e of its maximum value on the z-axis. 
    w_0 = beam waist, which is the minimum spot radius. 
    w_1, w_2 = spot radii at their respective mirrors. 
    R_1, R_2 = radii of curvature of the two laser mirrors. One of the 
        references‘ uses b_1 and b_2 as the notation for the radii of 
        curvature of the mirrors. 
    d = separation of two laser mirrors. 
    d_1, d_2 = distances to mirrors as shown in Figs. 2 and 3. 
    λ = wavelength = 6328 Å. 

The beam waist $w_0$ is given by the following²: 

$$w_0^2 = \frac{\lambda\sqrt{d(R_1 - d)(R_2 - d)(R_1 + R_2 - d)}}{\pi(R_1 + R_2 - 2d)}. \tag{2}$$


Output spot sizes were calculated using* 


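$$\left(\frac{w_1}{w_2}\right)^2 = \frac{R_1}{R_2}\,\frac{R_2 - d}{R_1 - d} \tag{3}$$
and 
$$(w_1 w_2)^2 = \left(\frac{\lambda}{\pi}\right)^2 \frac{R_1 R_2 d}{R_1 + R_2 - d}. \tag{4}$$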
Locations of the beam waists were obtained from! 
$$d_1 = \frac{d(R_2 - d)}{R_1 + R_2 - 2d} \tag{5}$$
and 
$$d_2 = \frac{d(R_1 - d)}{R_1 + R_2 - 2d}. \tag{6}$$
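
Equations (2), (5) and (6) are easy to evaluate numerically. The sketch below assumes the standard two-mirror resonator forms of those equations; the mirror radii and spacing in the example are hypothetical and are not the values of the lasers in Figs. 2 and 3.

```python
import math

def beam_waist(lam, d, r1, r2):
    """Beam-waist radius from the form of (2) given above:
    w0^2 = lam*sqrt(d(R1-d)(R2-d)(R1+R2-d)) / (pi*(R1+R2-2d))."""
    w0_sq = lam * math.sqrt(d * (r1 - d) * (r2 - d) * (r1 + r2 - d))
    w0_sq /= math.pi * (r1 + r2 - 2 * d)
    return math.sqrt(w0_sq)

def waist_distances(d, r1, r2):
    """Distances d1, d2 from the waist to each mirror, as in (5) and (6)."""
    denom = r1 + r2 - 2 * d
    return d * (r2 - d) / denom, d * (r1 - d) / denom

# Hypothetical symmetric cavity: two 2-m mirrors spaced 1 m apart
lam = 0.6328e-6
print(beam_waist(lam, 1.0, 2.0, 2.0), waist_distances(1.0, 2.0, 2.0))
```

For a symmetric cavity the waist lies midway between the mirrors, which gives a quick sanity check on the two distances.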


To compute the apparent beam waist location, it is necessary to first 
correct for the negative lens effect of the output mirror.’ The output 
mirror acts like a negative lens transforming the phase front of the light 
wave emerging from the mirror. A mirror with a radius of curvature 
R and an index of refraction n transforms the phase front so that the 
radius of curvature is R/n. In this case (Fig. 2(a)), R = 2 m, n = 1.46 
(quartz) so that the new radius of curvature 


R′ = R/n = 2/1.46 = 1.37 m. 
With this radius of curvature, using® 
$$z = \frac{R'}{1 + \left(\dfrac{\lambda R'}{\pi w_2^2}\right)^2} \tag{7}$$

the apparent beam waist appears to be at distance z = 23.8 cm from 
the output mirror; this places the apparent beam waist 6 cm outside the 
laser as shown in Fig. 2(a). Similar computations produce the apparent 
beam waist location for the other lasers as indicated in their respective 
figures. The radius of the apparent beam waist is obtained using the 
value of $R_2/n$ rather than $R_2$ in (2). 


APPENDIX B 


Phase Relationships of Phase-Locked Loop 


It has been shown that when the field at the photomultiplier is 
$E = E_c \cos \omega_c t + E_u \cos \omega_u t$, the difference-frequency 
term is $E_c E_u \cos(\omega_c - \omega_u)t$, where $\omega_c$ and 
$\omega_u$ are the respective angular frequencies of the controlled and 
uncontrolled laser beams. To determine the phase relationships in the dc 
system, the field amplitudes are omitted and the angular frequencies and 
their phases are expressed as 

$$\cos[(\omega_c t + \varphi_1) - (\omega_u t + \varphi_2)] = \cos[(\omega_c - \omega_u)t + \varphi_1 - \varphi_2].$$

Let $(\omega_c - \omega_u) = \Delta\omega$ and $\varphi_1 - \varphi_2 = \Delta\varphi$; 
then the error signal from the photomultiplier is 

$$\cos(\Delta\omega \cdot t + \Delta\varphi).$$

When the system is phase-locked, the frequency-difference term 
$\Delta\omega \cdot t$ is equal to zero; therefore, 

$$\cos(\Delta\omega \cdot t + \Delta\varphi) = \cos \Delta\varphi.$$

This, in turn, can be written as sin(90° − Δφ), the error signal for the 
phase-lock laser loop. At phase-lock, this error voltage approaches zero 
and Δφ, the phase difference, is equal to 90° + α, where α is the 
instantaneous phase error of the loop. The output of the phase detector is 
proportional to sin α and, since α is small, sin α ≈ α; the dc error 
voltage is therefore proportional to the phase difference. 

When the system was modified to permit the use of a 2-MHz 
intermediate frequency, an additional phase detector was utilized to develop 
an error signal based on the output difference frequency term of the 
photomultiplier and a 2-MHz reference signal. Computations similar to 
the above show that the output error voltage of the IF phase detector is 
also proportional to the phase difference between the 2-MHz beat 
frequency from the photomultiplier and the reference frequency of 
2 MHz. 


APPENDIX C 


Beam Transformation Using A Telescope 


The location of the output beam waist after passing through a 
telescope consisting of two lenses with focal lengths $f_1$ and $f_2$, spaced 
at a distance $d = f_1 + f_2 + \Delta d$, is as follows,² where it is assumed 
the telescope is adjusted so that the misadjustment $\Delta d$ is 
approximately equal to zero. 

$$S_2 = -\left[(S_1 - f_1)\frac{f_2^2}{f_1^2}\right] + f_2$$

where $S_1$ is the distance from the input beam waist to the first lens and 
$S_2$ is the distance to the output beam waist from the output lens. Since 
we know $S_1$, substitution of the values for the reference telescope 
[$f_1$ = 1 cm and $f_2$ = 30 cm] gives us a value of 270 m for $S_2$. 

The radius of curvature of the wavefront is² 

$$R_1 = z\left[1 + \left(\frac{z_0}{z}\right)^2\right]$$

where $z_0 = \pi w_0^2/\lambda$ and z is the distance from the output beam 
waist to the photomultiplier [in this case z = $S_2$ − 3 m = 267 m]. Thus, 
$R_1$ is determined to be approximately 2000 m, which for our purpose is 
considered to be a planar phase front. Since there is a good possibility that 
the lens arrangement is not well adjusted, it is wise to observe the out- 
put beam of the telescope over an extended distance to insure that 
there is no noticeable divergence or convergence of the beam diameter. 
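
The radius-of-curvature expression used here and in Section V can be sketched as a short function. The form R = z[1 + (z₀/z)²] with z₀ = πw₀²/λ is the standard Gaussian-beam relation assumed above; the waist radius in the example is hypothetical, since the paper does not state the output waist of the telescope in this passage.

```python
import math

def radius_of_curvature(z, w0, lam=0.6328e-6):
    """Wavefront radius of curvature at distance z from a waist of radius w0."""
    z0 = math.pi * w0 ** 2 / lam   # characteristic (Rayleigh) distance
    return z * (1 + (z0 / z) ** 2)

# Hypothetical waist radius of 1 cm, evaluated at z = 267 m
print(radius_of_curvature(267.0, 0.01))
```

Because R is never smaller than 2z₀, a centimeter-scale waist at this wavelength always gives a radius of curvature of many hundreds of meters, which is why the reference wavefront can be treated as planar.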
State of the Art in GaP 


Electroluminescent Junctions* 


By M. GERSHENZON 


(Manuscript received June 10, 1966) 


Quantum efficiencies and brightness values for green and particularly 
for red light emission from currently available GaP p-n junctions in 
forward bias at room temperature are sufficiently high to merit con- 
sideration in electroluminescence applications where the human eye is 
the detector. 


I. INTRODUCTION 


Although the recombination radiation from forward-biased GaP 
p-n junctions could be used for the same applications as the emission 
from lower band-gap materials (e.g., in photon-coupled circuitry), 
the GaP emission occurs mainly in the visible portion of the spectrum 
and is thus more appropriate for applications where the human eye is 
the detector. To obtain emission in the visible from a forward-biased 
p-n junction, one needs a semiconductor with a band gap greater than 
1.8 eV. The II-VI compounds that meet this requirement cannot be 
made into simple p-n junctions (although some of their alloys can). 
Hence, only GaP, BP and the various polytypes of SiC are considered. 
(These are all indirect gap semiconductors so that stimulated emis- 
sion is not normally expected.) Of these three, GaP (band gap 2.26 
eV) is characterized by the simplest materials technology. 
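
The link between band gap and emission color can be illustrated with the photon energy-wavelength relation λ = hc/E. This is a minimal sketch using CODATA constants; the function name is illustrative.

```python
def wavelength_nm(energy_ev):
    """Photon wavelength in nanometers for a given photon energy in eV."""
    h = 6.62607015e-34      # Planck constant, J s
    c = 2.99792458e8        # speed of light, m/s
    q = 1.602176634e-19     # electron charge, C (J per eV)
    return h * c / (energy_ev * q) * 1e9

# 1.8 eV corresponds to about 689 nm (deep red); the 2.26-eV GaP gap
# corresponds to about 549 nm (green).
print(round(wavelength_nm(1.8), 1), round(wavelength_nm(2.26), 1))
```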


II. RADIATIVE RECOMBINATION MECHANISMS 


Fig. 1 shows a typical room temperature forward-bias emission 
spectrum from a diode prepared by Zn diffusion into an n-type crystal 
containing Te and O. Two emission bands appear in the visible, 
separated both spectrally and spatially. A weak green band is generated 
close to the junction proper, while a much stronger red band seems to 
originate on the p-side of the junction. Infrared emission seen in Fig. 1 
will not be discussed. 

* Presented at the Seminar on Electroluminescence and Semiconductor Lasers 
sponsored by the New York Section IEEE at Stevens Institute of Technology, 
May 11, 1966. 

Fig. 1 — Emission spectrum from a forward-biased Zn-diffused diode at room 
temperature. 


2.1 The Red Emission 


Mostly by comparison with photoluminescence, it has been shown 
that the red band is due to donor-acceptor pair recombination 
involving shallow Zn acceptors and deep O donors.¹ External 
photoluminescence quantum efficiencies of up to 11 percent have been 
reported at room temperature in p-type samples.² Zn-O pair band 
recombination in a p-n junction is sketched in Fig. 2. On the p-side of 
the junction the Zn level ($N_A \approx 2 \times 10^{18}$ cm$^{-3}$) is about 
half full of holes in thermal equilibrium. Injected minority carrier 
electrons are captured efficiently by the ionized compensating O donors 
and, because the O donor is relatively deep, the electrons remain trapped, 
with little thermal ionization back to the conduction band, until they 
recombine radiatively with holes on the Zn acceptors. This situation is 
identical to the Zn-O pair emission in photoluminescence and should lead to 
high efficiencies. On the n-side of the junction, the O donors are always 
filled with electrons. Injected minority carrier holes may be captured 
by the empty Zn acceptors, but, because the Zn acceptor level is quite 
shallow, they are thermally released back to the valence band, from 
where they may find other means to recombine. In the space-charge 
layer, the O donors are below the electron quasi-Fermi level from the 
n-side to deep into the depletion layer, and these donor states can be 
populated by electrons. However, the Zn acceptors lie above the hole 
quasi-Fermi level only very close to the p-side. It is only these 
acceptors that contain trapped holes. Hence, there is no region in the 
space-charge layer that contains both trapped electrons and trapped 
holes, and therefore, the Zn-O pair band should not be an efficient 
recombination mechanism in the depletion layer. Thus, we expect the 
red Zn-O band to originate predominantly from the p-side, beyond the 
space-charge layer. 

Fig. 2 — The Zn-O pair band mechanism in a forward-biased p-n junction. 


2.2 The Green Emission 


Among the many types of recombination leading to 
photoluminescence near the band edge at low temperatures there are (i) pair 
transitions involving a shallow donor and a shallow acceptor (e.g., Te and 
Zn),³,⁴ and (ii) the “A” line and its phonon replicas due to exciton 
recombination at an N atom substituting isoelectronically for a P 
atom.⁵,⁶ The green emission at low temperatures observed from 
junctions prepared from such material can be identified as due to these 
transitions by simply comparing electroluminescence and 
photoluminescence spectra.⁷ As the temperature of the diodes is increased, the 
pair band becomes weaker and the “A” line grows at first but then 
also diminishes in intensity and, at the same time, it broadens and 
merges with its phonon replicas. Above approximately 200°K, only a 
broad green emission band remains. It is, therefore, not clear whether 
the room temperature green band is due to isoelectronic N traps, or 
to shallow pairs, or to some new mechanism. The possibility of simple 
band-to-band recombination can be eliminated because the observed 
efficiency is several orders of magnitude greater than the efficiency 
(on the n-side, on the p-side, and in the space-charge layer) 
calculated using the band-to-band rate constant derived from a 
detailed-balance analysis of the absorption edge. 


III. INJECTION-RECOMBINATION KINETICS 


3.1 Dominant Current 


The current-voltage characteristics of Zn-diffused diodes can be 
explained quantitatively by assuming that there are several current 
generating mechanisms, each of which dominates in a different range 
of forward bias.¹ These mechanisms are summarized in Table I. We 
assume that the current J can always be written as exp(qV/nkT), 
where n will depend upon bias. At the lowest applied bias, surface 
leakage predominates and the effective n (at room temperature) is about 
four. In the next bias range the dominant current is due to 
recombination at deep levels in the space-charge layer. Here n = 2, but with 
increasing bias, pre-exponential terms (W is the junction width and $V_D$ 
the built-in potential) cause the effective value of n to decrease. In the 
next bias range, not observed in all diodes, recombination at a shallow 
level in the space-charge layer dominates. Although n is nominally 
equal to unity here, again pre-exponential terms perturb its value 
somewhat. Here the effective n lies between one and two and slowly 
decreases toward unity with increasing bias. Thus, in the space-charge 
regime, n starts at two, and approaches one at high bias. Finally, at 
the highest biases, simple injection beyond the depletion layer 
dominates with n = 1. (Conductivity modulation which should set in at 
even higher biases has so far not been observed.) 
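
The effective n in a given bias range can be estimated from two points on a measured current-voltage curve. This sketch assumes the pure exponential form J ∝ exp(qV/nkT) between the two points and ignores any pre-exponential bias dependence; the function name and sample values are illustrative.

```python
import math

K_B_OVER_Q = 8.617333262e-5   # Boltzmann constant over electron charge, V/K

def effective_n(v1, j1, v2, j2, temp_k=300.0):
    """Effective n from two (V, J) points, assuming J ~ exp(qV/nkT)."""
    return (v2 - v1) / (K_B_OVER_Q * temp_k * math.log(j2 / j1))

# Synthetic check: points generated with n = 2 at 300 K give back n = 2
kt = K_B_OVER_Q * 300.0
j = lambda v: math.exp(v / (2 * kt))
print(effective_n(1.0, j(1.0), 1.1, j(1.1)))
```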


3.2 Red Emission 


We have already noted that the red Zn-O emission seems to 
originate from the p-side of the junction. From near-field spatial 
distributions on a surface cleaved perpendicular to the junction plane it is 
evident that the green emission, at least at high biases, is centered at 
the junction itself, as defined by observations of the junction 
electro-optic effect. However, the red emission is not centered on the 
junction but lies on the p-side. At high biases the emission closest to the 
junction saturates and the emission volume simultaneously expands deeper 
into the p-side. This observation is inconsistent with n-side or 
space-charge layer recombination.¹ Thus, the red emission is generated on 
the p-side beyond the space-charge layer, as expected from Fig. 2. 
The spatial motion at high biases is due to the saturation of the 
recombination centers on the p-side.⁸ The injected carriers, therefore, must 
travel beyond the normal diffusion length in order to recombine. We 
again assume that we can write the red light intensity L in the form 
exp(qV/mkT). At low bias, with simple injection into the p-side and 
recombination at Zn-O pairs, m = 1. However, in the saturation range 
at high bias, with minority carrier transport limited by diffusion only, 
m = 2.⁸ 

TABLE I — DEPENDENCE OF CURRENT (J) AND OF Zn-O PAIR BAND EMISSION (L) 
UPON BIAS (V) AND THEIR COMPARISONS 

Dominant current, J ∝ exp(qV/nkT), in order of increasing bias: 

    Surface leakage (J ∝ exp βV, β = q/4kT)              n ≈ 4 
    Space-charge recombination, deep levels              n ≲ 2 
    Space-charge recombination, shallow levels           n → 1 
    Diffusion current on n-side, linear range            n = 1 
    Diffusion current on n-side, conductivity modulation n = 2 

Light emission from p-side, L ∝ exp(qV/mkT): 

    Linear range                                         m = 1 
    Diffusion, saturation                                m = 2 
    Conductivity modulation                              m = 2 
3.3 Green Emission 


It is an experimental result that m is always equal to unity for the 
green emission, independent of the bias. 


3.4 Light versus Current 


In Table I the J-V and L-V data (for the red band) are combined 
to show the dependence of light intensity upon current. At low bias, 
where surface leakage predominates, the light emission varies as ~J⁴. 
In the space-charge regime the relationship is quadratic, but it 
approaches linearity at high bias. At the highest biases, with saturation 
on the p-side, and with the current due to injection beyond the space- 
charge layer (hence, into the n-side), the relationship becomes sub- 
linear. Thus, the quantum efficiency of the Zn-O red band increases 
rapidly at first, then slowly levels off and finally decreases at the 
highest biases, thus exhibiting a maximum in the linear range. For 
the green emission m = 1 always. Hence, the quantum efficiency rises 
rapidly, then slowly levels off and remains constant up to the highest 
biases measured. 
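
The light-current relationships quoted above follow from eliminating V between the two exponential forms. As a sketch, assuming J ∝ exp(qV/nkT) and L ∝ exp(qV/mkT) with constant prefactors, one obtains L ∝ J^(n/m); the regime labels below are shorthand for the bias ranges described in the text.

```python
def light_current_exponent(n, m):
    """Exponent p in L ~ J**p when J ~ exp(qV/nkT) and L ~ exp(qV/mkT)."""
    return n / m

regimes = {
    "surface leakage": (4, 1),          # L ~ J^4
    "space-charge, deep levels": (2, 1),  # quadratic
    "injection, linear range": (1, 1),  # linear
    "injection, saturation": (1, 2),    # sublinear
}
for name, (n, m) in regimes.items():
    print(name, light_current_exponent(n, m))
```

Since the quantum efficiency is proportional to L/J, it rises while the exponent exceeds one, levels off where the exponent equals one, and falls once the exponent drops below one.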


IV. DIODE STRUCTURES 


The various types of GaP p-n junctions that have been reported in 
the literature are summarized in Table II. The circled structures were 
prepared expressly to exhibit the Zn-O red band. A typical in-diffused 
diode is made by diffusing Zn into an n-type crystal containing Te 
and O. A typical out-diffused diode is made by heating a p-type 
crystal doped with Zn, Te, and O, so that some Zn diffuses out, leaving an 
n-layer near the surface. Grown junctions may be prepared by 
floating-zone,⁹,¹⁰ by vapor phase epitaxy¹⁰,¹¹ or by solution-growth epitaxy.¹²,¹³ 
For example, in the latter case one can grow (by “tipping”) an n-type 
Te-doped layer from solution onto a p-type seed containing Zn + 
O. A typical alloyed diode is made by alloying a Sn ball onto a 
p-type sample containing Zn + O.¹⁴,¹⁵,¹⁶ Finally, surface structures may 
be prepared by evaporating a metal on a cold p-type substrate 
containing Zn, Te, and O.¹⁶ 

The diffused structures and the grown junctions are simple p-n 


TABLE II — GaP Drops (cIRcLED) DESIGNED To Exuipit THE ZN-O Rep Parr Emission 


GaP Junctions 





Diffused Grown Alloyed Surface 


In-Diffused Out-Diffused 


Method Doping Substrate Alloy Substrate Film 
Substrate Diffusant Substrate 


n Gn) | p Float-zone Mg-S p Gn) p 
(Te + O) Cd (Zn + Te + O) Cd-S (Zn + O) Ag-Te (Zn + Te + O) Sn 














(Cd + Te + O) | Vapor Mg-S n In-Zn Ag paste 
p Si Cd-S Au-Zn 
| Solution (Zn + O)-Te Ag-Zn 
p-n p-n p-n p-n Tunneling 
Surface-barrier, tun- 
neling

junctions, where injection is due to thermal activation over the normal
junction barrier.’ In the surface diodes injection arises from tunneling
through a thin surface layer.16 Three injection mechanisms can occur
in parallel in the alloyed diodes.16 In the regions where Sn alloying
produces an n-type regrowth layer on the p-type substrate, a simple 
p-n junction is formed. In regions where no n-type regrowth layer is 
produced, the metal is in intimate contact with the p-type substrate. 
This is a surface barrier junction which at forward bias can only ex- 
tract majority carrier holes. Since it cannot inject minority carriers it 
results in an excess nonradiative current component. In regions where 
a thin layer of insulator (perhaps an oxide) separates the metal from 
the substrate, it is also possible to inject minority carriers by tunnel- 
ing. 


V. RADIATIVE EFFICIENCIES 


5.1 Quantum Efficiency 


Table III summarizes the maximum reported external quantum effi- 
ciencies of the red Zn-O band at room temperature in the five classes 
of diodes described previously. Note that while the highest measured 
efficiency, 1.5 percent, corresponds to an alloyed diode,!* the maximum 
efficiencies observed in the other four classes are all within less than a 
factor of ten of this value. The table also lists some “average” effi- 
ciencies,!*:14 which is the range obtainable with high yields with pres- 
ent technology. The highest quantum efficiency reported for the green 
emission is 0.015 percent,!® or 100 times less than the corresponding 
figure for the red. Since the external quantum efficiencies for spon- 
taneous emission in GaAs diodes at room temperature are usually one 
to five percent, it is obvious that the red emission from GaP, only 
slightly less efficient, might be useful in applications where spontane- 


TABLE III — EXTERNAL QUANTUM EFFICIENCIES OF Zn-O
PAIR BAND IN GaP DIODES AT ROOM TEMPERATURE

Diode type       Structure          Maximum (%)   Average (%)   Source
In-diffused      Zn/Te + O          0.2                         BTL
Out-diffused     Zn + Te + O        0.7                         BTL
Solution-grown   Te/Zn + O          0.75          0.3-0.5       IBM
Alloyed          Sn/Zn + O          1.5                         Philips
                                                  0.01-0.1      SERL
Surface          Au/Zn + Te + O     0.4                         BTL

ous GaAs emitters are considered, as in optoelectronic devices. How-
ever, the significant distinction is that the GaP emission lies in the 
visible range. 


5.2 Luminous Efficiency 


By integrating the product of the emission spectrum of the Zn-O 
red band and the visual acuity curve, it is found that one watt of 
Zn-O red light is equivalent to 20 Lumens as far as the eye is con- 
cerned. The GaP green emission corresponds to about 650 Lumens/ 
watt. (For comparison, the emission from GaP$_x$As$_{1-x}$, where x cor-
responds to the maximum P concentration before the band structure
becomes indirect, is equivalent to approximately 100 Lumens/watt.)
Consider a typical diode, available with current technology, as sum-
marized in Table IV. It may operate at 20 mA with a dc bias of 2
volts emitting red light with an external quantum efficiency of 0.5 per-
cent. (Since the energy of the emitted photon is 1.77 eV, the power
efficiency is only slightly less than the quantum efficiency.) With a
junction area of $10^{-3}$ cm$^2$ the current density is 20 amps/cm$^2$, which is
close to the maximum in quantum efficiency. Table IV also notes the
output in normal power units as well as in luminous units. By assum-
ing that the light leaves the diode from only one surface in the active
junction area of $10^{-3}$ cm$^2$, the predicted brightness is 3600 foot-Lam-
berts. (Although present measurements are about a factor of ten lower,
the discrepancy might be decreased by using large ratios of active 
junction area to inactive surfaces, or by using special geometries 
or index-of-refraction-matching glasses to increase the light output 
from a given region.) The maximum reported efficiency for the green 
emission is only 0.015 percent but the luminous equivalent of the 


TABLE IV — TYPICAL LUMINOUS EFFICIENCY FOR THE
Zn-O RED BAND (20 LUMENS/WATT)

Typical diode:   20 mA
                 2 V
                 0.5% external quantum efficiency
                 $10^{-3}$ cm$^2$ area

Output:          $2 \times 10^{-4}$ watts
                 $4 \times 10^{-3}$ Lumens
                 $3 \times 10^{-4}$ candles

Brightness:      4 lamberts
                 3600 foot-Lamberts
                 1200 candles/square foot

green band is more than 30 times greater than that for the red band.
Thus, the brightness currently available from the best green diodes
is approximately equal to that available from current average red
diodes at biases where the red efficiency is a maximum. (At higher
biases the green emission of course will increase more rapidly than 
the red emission.) For comparison, the brightness of the green emission 
from a ZnS:Cu electroluminescent cell is about 1 foot-Lambert (at 
60 Hz, and up to 10 foot-Lamberts at higher excitation frequencies, 
but with significant deterioration during aging). Thus, with present 
technology, the red emission from GaP diodes corresponds to much 
higher brightness values than for standard ZnS EL cells, and this oc- 
curs in the bias range corresponding to maximum efficiencies — 0.3 to 
0.5 percent quantum efficiency. Similarly, the brightness of the GaP 
green emission is also higher than that available from ZnS panels. 
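The entries of Table IV can be checked with a short calculation. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, the candle figure assumes the flux is spread over a full sphere, and the brightness assumes all flux leaves one Lambertian face of the junction area. Because the table uses rounded intermediate values, the computed numbers agree with it only to within about 15 percent.

```python
import math

CM2_PER_FT2 = 929.0304  # square centimetres in one square foot


def red_diode_output(current_a=20e-3, quantum_eff=0.005, photon_ev=1.77,
                     area_cm2=1e-3, lumens_per_watt=20.0):
    """Radiometric and photometric output of the 'typical diode' of Table IV."""
    # Each injected electron yields quantum_eff photons of energy photon_ev,
    # so the optical power in watts is eff * I * (photon energy in eV).
    watts = quantum_eff * current_a * photon_ev
    lumens = watts * lumens_per_watt
    # Assumption: flux spread uniformly over 4*pi steradians.
    candles = lumens / (4.0 * math.pi)
    # Assumption: all flux leaves one Lambertian face of area area_cm2;
    # one lambert then corresponds to one lumen per square centimetre.
    lamberts = lumens / area_cm2
    foot_lamberts = lamberts * CM2_PER_FT2
    candles_per_ft2 = foot_lamberts / math.pi
    return {
        "watts": watts,
        "lumens": lumens,
        "candles": candles,
        "lamberts": lamberts,
        "foot_lamberts": foot_lamberts,
        "candles_per_ft2": candles_per_ft2,
    }


if __name__ == "__main__":
    for name, value in red_diode_output().items():
        print(f"{name:>16s}: {value:.3g}")
```

The electrical input is 20 mA × 2 V = 40 mW, so the power efficiency in this sketch is 0.005 × 1.77/2, slightly below the quantum efficiency, as stated in the text.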


5.3 Efficiency Outlook 


Since the quantum efficiency of the Zn-O red band is as high as 11 
percent in photoluminescence of p-type samples, it might be possible 
to obtain similar electroluminescence efficiencies from p-side injection 
in junctions. At low to moderate biases the dominant competing mech- 
anism is due to space-charge recombination at deep levels. Thus, a 
reduction of this current component could increase the red emission 
efficiency. Since the room temperature green emission mechanism has 
not been established, similar predictions for the green band cannot be 
made. Finally, it is noted that a number of other deep pair combina- 
tions exhibit donor-acceptor pair recombination in the orange and red 
in photoluminescence at room temperature with efficiencies of several 
percent.19,20 These may eventually provide useful recombination cen-
ters in GaP diodes.


VI. SUMMARY 


Currently available p-n junctions in GaP emit in the red with an 
external quantum efficiency (roughly equal to a power efficiency) 
which exhibits a maximum (with bias) of 0.1 to 0.5 percent. The 
brightness at this maximum is within a factor of 10 of 3600 foot-Lam- 
berts, far greater than that from a normal ZnS EL cell. The best green 
emitting diodes available correspond to a similar brightness value, but 
the efficiency does not drop with increasing bias. Such diodes should 
possess the normal advantages of semiconductor devices: low dc
operating bias, small size, probably low manufacturing cost and, hope-
fully, little deterioration with aging. Special diode geometries or the
use of index-of-refraction-matching glasses might be used to increase 
the external quantum efficiency (although the red band falls in a re- 
gion of low internal absorption, the green band, near the band edge, 
does not) or to focus the emitted light, thereby increasing the apparent 
brightness.

Schottky Barrier Photodiodes with Antireflection Coating


By M. V. SCHNEIDER 
(Manuscript received July 19, 1966) 


Schottky barrier diodes can be used for fast and efficient photodetectors
if the incident light is coupled into the depletion layer of the diode and if
electron-hole pairs are created by the internal photoelectric effect in the
depletion layer. Fast response of the diode is achieved by designing a Schottky
barrier with a small RC product. High quantum efficiency is obtained by
coupling the light through a thin metal layer into the depletion region of
the diode and by using an antireflection coating on the metal layer for
matching the incident light beam.

Schottky barrier photodiodes have been made with thin semitransparent
gold layers on n-type epitaxial silicon and with zinc sulfide as an antire-
flection coating. A net quantum efficiency of 70 percent has been achieved
at the He-Ne laser wavelength of 6328 Å. The pulse response of packaged
diodes with 0.5-nanosecond wide pulses shows a symmetrical pulse shape
with only small distortion due to carrier diffusion and reactance in the
completed package.

The diode structure is suitable for detector arrays. It is also useful for
optical time domain reflectometry. The technique of coupling light through
metal layers can be extended to other optical devices which require efficient
transfer of radiation into a semiconductor through conducting electrodes.


I. DEFINITION OF THE SCHOTTKY BARRIER PHOTODIODE 


A Schottky barrier is a rectifying metal-to-semiconductor contact with 
certain properties which have been originally described by Schottky.1
The main feature of the Schottky barrier is that it has the properties 
of an ideal step junction and that only majority carriers are involved 
in the rectification process. Schottky barriers have been used for various 
devices in the microwave region. A few examples are the Au-n-type 
GaAs and the Au-n-type Si varactors described by Kahng and D’Asaro?4 
and the honeycomb millimeter diode by Irvin and Young.5 The Schottky
barrier has not been used to any great extent for optical devices because
of the difficult problem of coupling optical radiation through the metal 
contact into the semiconductor. The purpose of this paper is to present 
a solution to this problem and to describe the properties of a completed 
diode which will be defined as a Schottky barrier photodiode. 

The Schottky barrier photodiode is a rectifying metal-to-semicon- 
ductor contact in which electron-hole pairs are created in the semicon- 
ductor by the internal photoelectric effect under incident illumination. 
The separation of the pairs is accomplished by the built-in electric field 
in the barrier or by an externally applied field across the barrier. The 
separation of the carriers leads to a photocurrent in the external circuit 
which may be amplified and detected. Internal amplification by ava- 
lanche multiplication cannot be achieved in a Schottky barrier photo- 
diode because of nonuniform field intensities at the boundary of the 
metal-semiconductor interface. 


II. STRUCTURE OF OPTICAL JUNCTION DETECTORS 


Photodetectors with a high frequency response consist usually of a 
semiconductor p-n junction or a semiconductor p-i-n structure. A sche- 
matic drawing of such a detector is shown in Fig. 1. The incident radia- 
tion is absorbed in the intrinsic layer which is sandwiched between a 
p and an n-layer. Electron-hole pairs are created by the internal photo- 
electric effect and are separated by an applied electric field across the 
junction. Metal contacts are required on both sides of the structure in 
order to apply the electric field and to collect the carriers. The contacts 
on top of the p-layer shown in Fig. 1 are made in the form of stripes in 
order to transmit the incident radiation between adjacent stripes into 
the p-i-n region.

Fig. 1 — p-i-n photodiode with contact stripes on p-layer.

High quantum efficiency and fast response are achieved by proper
choice of the semiconductor material and the physical dimensions of 
the layers including the contact stripes. Design criteria have been 
discussed by Anderson,6 Lucovsky and Emmons,7 and Riesz.8 Internal
multiplication with uniform and microplasma-free junctions with a
guard ring has been achieved by Anderson, McMullin, D’Asaro and
Goetzberger9 and by Melchior and Lynch.10

A different approach is necessary for the case of a Schottky barrier 
photodiode shown in Fig. 2. A semitransparent metal is deposited on
the surface of a semiconductor in order to create a surface barrier. The 
light is matched into the barrier by an antireflection coating which is
deposited on the semitransparent metal. A thick metal dot or a metal
ring with a contact wire serves as an external contact for applying the
dc back bias and for collecting carriers. The semiconductor material
and the applied back bias are chosen in such a way that most of the
carriers are created within the depletion layer. The net quantum effi-
ciency of the device is determined mainly by the transmission loss in
the metal film and by the quality of the antireflection coating. The
response time is determined by the transit time of the carriers through
the depletion layer and the RC product of the diode. Design criteria for
achieving a small RC product will be discussed later in this paper.

Fig. 2 — Schottky barrier photodiode with antireflection coating on metal
film.

Coupling of the incident light beam into the Schottky barrier can 
also be achieved by other means. Fig. 3 is a cross-sectional view of the 
Sharpless photodiode." A point contact is formed on epitaxial mate- 
rial and the light is focused into the Schottky barrier through an etched 
dome in the semiconductor. An antireflection coating is not necessarily 
required because the semiconductor surface does not present a serious 
optical mismatch to the incident wave.

Fig. 3 — Point contact Schottky barrier photodiode with etched dome in epi-
taxial semiconductor.


Another way to build a Schottky barrier photodiode is shown in Fig.
4. A thick metal coating is applied to the semiconducting material.
An array of slots or holes is etched into the metal. The holes are close
to resonance at the wavelength of the incident radiation, e.g., they are 
approximately a quarter wavelength wide and are spaced approximately 
a half wavelength apart. The thickness of the metal has to be much 
smaller than a wavelength because the excited mode in the hole or the 
slot is under cutoff. The remaining reactive part of the surface impedance
of this structure is compensated by a suitable antireflection coating.
This coating is only required for improving the final match of the device
because a reactance sheet can be designed with a high return loss without
any further matching elements.

Fig. 4 — Photodiode or photodetector with metallic surface reactance sheet
and antireflection coating.

Photodiodes of the type shown in Fig. 2 have been made on n-type
epitaxial silicon for maximum response at the 6328 Å line of a He-Ne
gas laser. Gold has been used for formation of the Schottky barrier and
zinc sulfide for the antireflection coating. The results are discussed in
the following sections of this paper. Various technological improve- 
ments in the technique of fabricating microarrays will have to be 
achieved in order to fabricate the photodiode shown in Fig. 4. 


III. TRANSMISSION OF LIGHT THROUGH METAL FILMS 


Metal films are characterized by high reflectivity and low transmission 
in the visible range of the spectrum. The transmission through the film 
can be increased by reducing its thickness. The reflection can be de- 
creased by a dielectric film acting as a quarter wave transformer on top 
of the metal film. These two simple steps make it possible to transmit 
light into an optical device which requires metal electrodes. 

Optical constants of thin metal films are listed by Schopper," Heav-
ens, and Mayer.15 The physical theory and measurements are described
by Parker Givens16 and by Abelès.17 A marked dependence of the optical
constants on film thickness is observed. Other parameters of importance
include the technique used in the deposition process, the substrate
temperature, deposition rate and surface properties of the substrate.
A typical example of steps taken in substrate preparation, purity of
materials and pressures observed in the vacuum chamber is described
by Bennett and Ashley18 and a review on nucleation and film growth
as a function of various parameters is given in a paper by Behrndt.19
What complicates matters for device applications is the fact that the
films may not be continuous and that the index of refraction depends
on the angle of incidence as described by Hall.20

These difficulties do not prevent the fabrication of an antireflecting 
metal-semiconductor surface. Fairly consistent results can be achieved 
with gold evaporated from tungsten or molybdenum boats under high 
vacuum or ultra-high vacuum conditions. The optical surface impedance 
of the structure can be measured and from this one can determine a 
unique dielectric constant and a unique thickness which will allow optical 
matching at a specified wavelength. The steps in this procedure are 
similar to matching microwave networks by using the Smith Chart.

The reflectance and the transmittance which one can expect from a
thin gold film at λ = 6328 Å are shown in Fig. 5. Reflectance, transmit-
tance and loss are plotted as a function of film thickness for an unsup-
ported Au film with an index of refraction N = n − j·k = 0.30 − j·3.0.
The only assumptions used in this plot are that one deals with normal
incidence and that this particular index of refraction is independent of
the film thickness. The index N = 0.30 − j·3.0 is an approximate
value for bulk gold obtained from measurements described by Parker
Givens.16 Other data for gold deposited under various conditions and
listed by Schopper® cover an approximate range of n = 0.30 ± 0.10
and k = 3.0 ± 1.0 at wavelengths in the range from 6000 Å to 6600 Å.

The exact thickness of a thin metal film is usually of secondary im- 
portance for many device applications. What one needs to know for 
devices described in this paper is reflectance, transmittance, and loss 
for a specified surface resistance (sheet resistance) of the film. The 
surface resistance of the film limits the frequency response of the device 
because it will contribute to the resistive part of the device. The rela- 
tionship between the surface resistance and the RC product will be 
discussed later. 

Fig. 6 is a plot of reflectance, transmittance and loss measured for
Au films at λ = 6328 Å for surface resistances in the 3 to 6 ohm/square
range. The Au films are deposited on fused quartz slides under the
following conditions:

(i) The substrates are cleaned ultrasonically in successive baths
of a detergent, distilled water and alcohol. They are dried with
tank nitrogen and vapor degreased in isopropyl alcohol in the
apparatus described by Holland.21

(ii) The substrate is transferred into a VE-400 (Vacuum Electronics,
Inc.) vacuum system which is pumped down to a pressure of
$2 \times 10^{-7}$ torr. Due to the location of the ionization gauge, which
is between the diffusion pump and the liquid nitrogen cold
trap, the pressure in the bell jar is an order of magnitude higher.
Gold is evaporated from a tungsten coil located 6 inches from
the substrate with estimated deposition rates in the range of 5
to 10 Å/sec. The quartz substrate is not heated.

(iii) The dc resistance of the film is continuously monitored during
evaporation with two silver contacts shown in Fig. 6. Additional
silver contacts are applied immediately after the gold evapora-
tion in order to sandwich the gold layer between two layers of
silver. This method ensures minimum contact resistance between
the silver and the gold. All three layers (Ag, Au, Ag) are applied
consecutively without opening the vacuum system.

Fig. 5 — Transmittance, reflectance, and loss for thin metal film with N = 0.30
− j·3.0.

Fig. 6 — Transmittance, reflectance, and loss of evaporated gold films on fused
quartz substrates.


The thickness of the films is measured with a multiple beam inter-
ferometer; e.g., the film with a 5.7 ohm/square sheet resistance has a
thickness of 180 ± 15 Å. This particular sheet resistance is approxi-
mately 4.5 times higher than that which one would obtain from the
resistivity ρ of bulk gold for a thickness of 180 Å (ρ = $2.44 \times 10^{-6}$ ohm
cm at room temperature). One of the reasons for this discrepancy is the
fact that conduction in thin films depends upon the scattering from the
film boundaries. This means that bulk resistivities cannot be achieved
for thin films. Another effect of importance is that the film may be
discontinuous; that is, the film consists of a number of islands with
partial bridging between adjacent islands as described by Chopra22
and Francombe and Sato.23 The optical properties of such a film can
be characterized by a complex index of refraction provided that the
average distance between neighbouring islands is a small fraction of the
optical wavelength.
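The comparison with bulk gold can be made explicit. The following is a minimal sketch (the function name is hypothetical) that divides the bulk resistivity by the film thickness to obtain the sheet resistance an ideal film would have, and then forms the ratio to the measured 5.7 ohm/square over the quoted thickness tolerance.

```python
def bulk_sheet_resistance(resistivity_ohm_cm, thickness_angstrom):
    """Sheet resistance (ohm/square) of a film that has the bulk resistivity."""
    thickness_cm = thickness_angstrom * 1e-8  # 1 angstrom = 1e-8 cm
    return resistivity_ohm_cm / thickness_cm


RHO_GOLD = 2.44e-6   # ohm cm, bulk gold at room temperature (from the text)
MEASURED = 5.7       # ohm/square, measured film (from the text)

if __name__ == "__main__":
    for d in (165.0, 180.0, 195.0):  # 180 +/- 15 angstrom
        ideal = bulk_sheet_resistance(RHO_GOLD, d)
        print(f"d = {d:5.0f} A: bulk value {ideal:.2f} ohm/sq, "
              f"measured/bulk = {MEASURED / ideal:.2f}")
```

At the nominal thickness the ratio comes out slightly above 4, and over the thickness tolerance it spans roughly 3.9 to 4.6, in line with the factor of approximately 4.5 quoted above.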

The reflectance and transmittance curves shown in Fig. 5 are calcu- 
lated for a metal film which is not supported by any substrate. Fig. 7 
is a similar plot for a film supported by a substrate with an index of 
refraction N = 3.75. This particular index corresponds approximately 
to silicon with N = 3.75 − j·0.18 at λ = 6328 Å. Comparison with Fig.
5 shows that the reflectance for a specified film thickness is higher. The
loss in the metal film is lower because of the increased reflectance.
Reflectance, transmittance, and loss are the same for very thick films as
shown by the calculated points for d = 1000 Å.


IV. OPTICAL MATCHING OF A METAL-SEMICONDUCTOR CONTACT 


A metal-semiconductor contact can be optically matched at a speci- 
fied wavelength if the reflection coefficient or the surface impedance is 
known for that particular wavelength.

Fig. 7 — Transmittance, reflectance, and loss of metal film on dielectric sub-
strate.

The reflection coefficient $r_1$ at the interface of two media in Fig. 8 is
given by

$$ r_1 = \frac{g_1 - g_0}{g_1 + g_0}. \tag{1} $$

The quantities $g_k$ ($k = 0, 1$) are generalized impedances or admittances
(immittances) of the two media and $\mu_k$ ($k = 0, 1$) is the permeability
of the medium. The immittance is directly related to the index of refrac-
tion of that particular medium. The transmission coefficient $t_1$ is

$$ t_1 = \frac{2 g_1}{g_1 + g_0}. \tag{2} $$

Fig. 8 — Plane wave reflected and transmitted from plane boundary.

Equations (1) and (2) are exactly identical with the equations used 
for computing the voltage or current reflection coefficient and the trans- 
mission coefficient for two adjacent RF transmission lines at different 
impedance levels. 

A sequence of plane parallel films can be treated by applying (1) 
and (2) with a recursion formula which takes into account the phase 
shift between two adjacent media. The exact procedure is derived by 
Wolter.24 The result with a sequence of three media for the reflection
coefficient $r_2$ and the transmission coefficient $t_2$ is

$$ r_2 = \frac{(g_2 - g_1)(g_1 + g_0)\exp(p_1 d_1) + (g_2 + g_1)(g_1 - g_0)\exp(-p_1 d_1)}{(g_2 + g_1)(g_1 + g_0)\exp(p_1 d_1) + (g_2 - g_1)(g_1 - g_0)\exp(-p_1 d_1)} \tag{3} $$

$$ t_2 = \frac{4 g_1 g_2}{(g_2 + g_1)(g_1 + g_0)\exp(p_1 d_1) + (g_2 - g_1)(g_1 - g_0)\exp(-p_1 d_1)} \tag{4} $$

with the notation shown in Fig. 9. The quantities $g_k$ are again the im-
mittances of the media. The exponential term $\exp(\pm p_1 d_1)$ represents the
phase shift between the two adjacent boundaries shown in Fig. 9.

Equations (3) and (4) are valid if the impedance or admittance of
the center medium is complex; e.g., if it is a metal. A plane wave launched 
in medium 2 will excite a hybrid wave in medium 1; that means planes 
of equal phase and of equal amplitude will not coincide unless one deals 
with normal incidence. A wave with parallel planes for equal phase and 
equal amplitude can be propagated in an absorbing medium. Such a 
wave, however, cannot be excited by a plane wave coupled into the 
absorbing medium through a plane boundary at an oblique angle. This 
property leads to an index of refraction which is a function of the angle 
of incidence. Further details may be found in the original work by
Fry.25,26
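The three-media result of (3) and (4) lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions, not the procedure used for the figures: the immittances are taken equal to the complex indices of refraction N = n − j·k (admittances normalized to free space), the propagation constant is assumed to be p1 = j·2π·N1/λ, incidence is normal, and the power transmittance is taken as |t2|² Re(g0)/Re(g2). The function names are hypothetical.

```python
import cmath
import math


def three_media_coefficients(g2, g1, g0, d1, wavelength):
    """Amplitude reflection and transmission coefficients, eqs. (3) and (4).

    g2: incident medium, g1: film, g0: substrate (complex indices assumed).
    d1 and wavelength must be in the same length unit.
    """
    g2, g1, g0 = complex(g2), complex(g1), complex(g0)
    p1 = 1j * 2.0 * math.pi * g1 / wavelength  # assumed propagation constant
    ep = cmath.exp(p1 * d1)
    em = cmath.exp(-p1 * d1)
    den = (g2 + g1) * (g1 + g0) * ep + (g2 - g1) * (g1 - g0) * em
    r2 = ((g2 - g1) * (g1 + g0) * ep + (g2 + g1) * (g1 - g0) * em) / den
    t2 = 4.0 * g1 * g2 / den
    return r2, t2


def power_fractions(g2, g1, g0, d1, wavelength):
    """Reflectance, transmittance and loss in the film as power fractions."""
    r2, t2 = three_media_coefficients(g2, g1, g0, d1, wavelength)
    reflectance = abs(r2) ** 2
    transmittance = abs(t2) ** 2 * complex(g0).real / complex(g2).real
    return reflectance, transmittance, 1.0 - reflectance - transmittance


if __name__ == "__main__":
    n_gold = 0.30 - 3.0j  # approximate bulk value quoted in the text
    for d in (100.0, 200.0, 1000.0):  # film thickness in angstroms
        R, T, loss = power_fractions(1.0, n_gold, 1.0, d, 6328.0)
        print(f"d = {d:6.0f} A: R = {R:.3f}, T = {T:.3f}, loss = {loss:.3f}")
```

Two limiting cases serve as sanity checks on the sign conventions: a lossless quarter-wave layer with index equal to the geometric mean of its neighbours gives zero reflection, and a very thick gold film reduces to the single-interface reflectance of (1).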

The reflection coefficient $r_2$ in (3) should be made as small as possible
for building devices with a high transmission into the substrate. This 
cannot be achieved for a metal-semiconductor structure. It is possible, 
however, to deposit an additional film on the metal and to compensate 
the complex reflection coefficient by proper choice of the index of refrac-
tion and the thickness of this antireflection coating. The surface im-
pedance on top of medium 1 in Fig. 9 which represents the metal is

$$ G_0 = g_2 \, \frac{1 - r_2}{1 + r_2} = U + jV \tag{5} $$

with U being the real part and jV the imaginary part of the surface
impedance. $G_0$ can be matched to an impedance $G_2$ by a dielectric layer
with a proper thickness D and a proper impedance $G_1$ if

$$ G_1^2 = G_2 \left( \frac{V^2}{U - G_2} + U \right) \tag{6} $$

$$ D = \frac{\lambda}{2 \pi n} \arctan \left( \frac{G_1 (U - G_2)}{G_2 V} \right). \tag{7} $$

The notation is shown in Fig. 10. The quantity n is the index of refrac-
tion of the dielectric material. For an interface with a real surface
impedance $G_0 = U$, one obtains with $V = 0$ from (6) and (7)


$$ G_1 = \sqrt{G_0 G_2} \tag{8} $$

$$ D = \frac{1}{4} \, \frac{\lambda}{n}. \tag{9} $$

This is the well-known relationship for a quarter-wave transformer
connecting microwave transmission lines at different impedance levels.
The wavelength λ is the vacuum wavelength.

Fig. 9 — Reflection and transmission for 3 media.

A practical example is treated in Fig. 11. Reflection coefficients for
a Au-Si structure are plotted in the complex plane for a gold layer with
a thickness of 100 Å and 200 Å. The example refers to normal incidence
at a wavelength of λ = 6328 Å. It is assumed that the index of refrac-
tion is $N_1 = 0.28 - j \cdot 3.01$ for gold and $N_0 = 3.72 - j \cdot 0.18$ for silicon.
The index of refraction and the thickness of the antireflection coating
are listed in Table I.

Fig. 10 — Impedance matching with antireflection coating on medium No. 1.

Fig. 11 — Surface impedance of gold-silicon Schottky barrier at λ = 6328 Å.


The loss in the gold film and the reflectance and the transmittance of
the complete structure are shown in Fig. 12 and Fig. 13. All three quan-
tities are plotted as a function of the thickness of the gold film for two
fixed antireflection coatings with n = 2.30, D = 500 Å and n = 3.30,
D = 240 Å. Minimum reflectance is achieved as predicted in Table I.
All curves are obtained by applying (3) and (4) with a recursion formula 
for one additional layer. 
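The matching procedure of (3) and (5) to (7) can be followed numerically for the 100 Å gold film on silicon. The sketch below is a reconstruction under stated assumptions rather than the original computation: immittances are treated as admittances equal to the complex index N = n − j·k, with air as the incident medium (G2 = 1) and p1 = j·2π·N1/λ. Since the sign attached to V depends on the convention adopted for the surface impedance, the arctangent is taken on the branch between 0 and π that satisfies the transmission-line matching condition, and the match is then verified directly. All function names are hypothetical.

```python
import cmath
import math


def surface_admittance(n_film, n_sub, d, wavelength, g2=1.0):
    """Surface admittance on top of the metal film, from eqs. (3) and (5)."""
    g1, g0 = complex(n_film), complex(n_sub)
    p1 = 1j * 2.0 * math.pi * g1 / wavelength  # assumed propagation constant
    ep, em = cmath.exp(p1 * d), cmath.exp(-p1 * d)
    r2 = (((g2 - g1) * (g1 + g0) * ep + (g2 + g1) * (g1 - g0) * em)
          / ((g2 + g1) * (g1 + g0) * ep + (g2 - g1) * (g1 - g0) * em))
    return g2 * (1.0 - r2) / (1.0 + r2)


def matching_layer(G0, wavelength, G2=1.0):
    """Index n, thickness D and phase angle (degrees) of the matching layer."""
    U, V = G0.real, G0.imag
    if V == 0.0:
        # Real surface admittance: quarter-wave transformer, eqs. (8), (9).
        G1, theta = math.sqrt(U * G2), math.pi / 2.0
    else:
        if U == G2:
            raise ValueError("no single-layer match when U == G2 and V != 0")
        G1_sq = G2 * (V * V / (U - G2) + U)  # eq. (6)
        if G1_sq <= 0.0:
            raise ValueError("no real matching index exists")
        G1 = math.sqrt(G1_sq)
        # tan(theta) from the matching condition; equivalent to eq. (7)
        # up to the sign convention adopted for V.
        theta = math.atan2(G1 * (G2 - U), G2 * V) % math.pi
    n = G1  # admittance normalized to free space equals the index
    D = wavelength * theta / (2.0 * math.pi * n)
    return n, D, math.degrees(theta)


def input_admittance(G0, n, theta):
    """Admittance seen through a lossless layer of index n, phase theta."""
    t = math.tan(theta)
    return n * (G0 + 1j * n * t) / (n + 1j * G0 * t)


if __name__ == "__main__":
    G0 = surface_admittance(0.28 - 3.01j, 3.72 - 0.18j, 100.0, 6328.0)
    n, D, angle = matching_layer(G0, 6328.0)
    print(f"n = {n:.2f}, D = {D:.0f} A, phase angle = {angle:.0f} deg")
```

For the 100 Å film this procedure yields values close to the first row of Table I. The second row can be approached in the same way by changing the film thickness, although the result is more sensitive to the assumed optical constants.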

The conclusion from the results of Figs. 12 and 13 is that transmission 
with low loss into the silicon substrate is feasible. 

The reflectance achieved for three evaporation processes with zinc 
sulfide deposited on a Au-Si surface barrier is shown in Fig. 14. Gold 
layers with sheet resistances in the range of 6 ohm/square are first 
evaporated on epitaxial silicon. Zinc sulfide is evaporated on the gold 
layer. The return loss at λ = 6328 Å is continuously measured with an
optical reflectometer and a He-Ne laser as a signal source. The re-
flectometer is similar to the one described by Perry.27 The measured
return loss is calibrated in dB. The evaporation is continued after
reaching the first minimum in one case in order to show the periodicity
of the process. One concludes that an improvement of 8 dB to 9 dB in
return loss is possible with a single layer of zinc sulfide. The return
loss without the matching layer is 3 dB. The total return loss is therefore
11 dB to 12 dB.
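Return loss figures translate into reflected power fractions through the usual decibel relation. A minimal sketch (the function name is hypothetical):

```python
def reflectance_from_return_loss(return_loss_db):
    """Reflected power fraction for a return loss given in dB."""
    return 10.0 ** (-return_loss_db / 10.0)


if __name__ == "__main__":
    for rl in (11.0, 12.0):  # total return loss with the ZnS layer
        print(f"{rl:.0f} dB return loss -> "
              f"{100.0 * reflectance_from_return_loss(rl):.1f} percent reflected")
    for gain in (8.0, 9.0):  # improvement due to the ZnS layer
        print(f"{gain:.0f} dB improvement -> reflectance reduced "
              f"by a factor of {10.0 ** (gain / 10.0):.1f}")
```

A total return loss of 11 dB to 12 dB thus corresponds to roughly 6 to 8 percent of the incident power being reflected.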


TABLE I

Thickness of   Index n of   Thickness of      Thickness of Coating in
Au Film        Coating      Coating in Å      Terms of Phase Angle

100 Å          2.28         510 Å             66°
200 Å          3.33         242 Å             46°

Fig. 12 — Reflectance, transmittance, and loss from Au-Si surface barrier with
500 Å thick ZnS antireflection coating.

It is difficult to measure the transmittance or the loss in the metal
for a ZnS-Au-Si structure. Some indication may be obtained from the
measurement of the net quantum efficiency of the device if all the carriers
can be collected and if there is no internal multiplication. Another direct
method is to use the fact that the losses in the metal will increase its
temperature and change its resistance. The resistance changes could be
simulated by obtaining the same increase with a dc current flowing in
the Au film. The same procedure is used for power detector calibrations
in the microwave frequency range.

Fig. 13 — Reflectance, transmittance, and loss from Au-Si surface barrier with
240 Å thick ZnS antireflection coating.

Fig. 14 — Return loss from Si-Au surface barrier during deposition of antire-
flection ZnS layer.

The transmittance through Au-ZnS has been measured for a slightly
modified case shown in Fig. 15 using a 1-mm thick quartz slide as a
substrate. A 5-ohm/square Au film is evaporated on the quartz. The
reflectance is 40 percent and the transmittance is 48 percent at 6328 Å
as shown in Fig. 6. Zinc sulfide is then evaporated on the Au. The return
loss and the transmission are continuously measured with a double
reflectometer at λ = 6328 Å mounted inside the vacuum system. The
double reflectometer records transmitted and reflected power simul-
taneously as shown by the coincidence of maxima and minima on the
time scale in Fig. 15. The calibration in dB is obtained with a set of
standard optical attenuators. The results from this experiment are:

(i) The transmittance is improved by 2.5 dB. This means that 76
percent of the incident light is transmitted.

(ii) The reflectance is decreased by 10 dB, which means that the
reflectance is reduced from 40 to 4 percent.

(iii) The process is periodic, which means the losses in the ZnS are
small.

(iv) The evaporation process can be interrupted at any time and
resumed later without changing the periodicity of the process
and the levels of the minima and the maxima.

Fig. 15 — Return loss and transmission of gold film on quartz substrate during
deposition of ZnS antireflection coating.


V. DESIGN OF THE SCHOTTKY BARRIER PHOTODIODE 


5.1 Optical Absorption and Carrier Generation in the Schottky Barrier 


The absorption coefficient α of the semiconductor and the width of
the depletion layer w are important parameters for designing a Schottky
barrier photodiode. The reason for this is as follows. The photocurrent 
through the depletion layer consists of two contributions. One is due to 
the carriers created within the layer, the other is due to carriers gen- 
erated in the adjacent bulk material which diffuse later into the junction. 
Minority carriers which enter the junction by diffusion will be swept 
across the junction by the applied external field. This diffusion current 
may lead to delay distortion if the incident wave is pulsed or rf modu- 
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lated. The diffusion current is small if most of the optical power is 
absorbed within the depletion layer. This requires 


ioe (10) 


a 
The upper limit for w is determined by the transit time which can be 
tolerated for the carriers. 
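Condition (10) can be made concrete with a short numerical sketch. It assumes simple exponential (Beer-Lambert) absorption of the light that enters the depletion layer; the function name and the value of α used below are illustrative assumptions, not data from this paper.

```python
import math

def absorbed_fraction(alpha_per_cm: float, w_um: float) -> float:
    """Fraction of the light entering the depletion layer that is absorbed
    within a layer of width w, assuming exponential absorption."""
    w_cm = w_um * 1.0e-4  # 1 micron = 1e-4 cm
    return 1.0 - math.exp(-alpha_per_cm * w_cm)

if __name__ == "__main__":
    alpha = 4.0e3  # hypothetical absorption coefficient in 1/cm
    for multiple in (0.5, 1.0, 2.0, 3.0):
        w_um = multiple / alpha * 1.0e4
        frac = absorbed_fraction(alpha, w_um)
        print(f"w = {multiple:.1f}/alpha = {w_um:.2f} um: absorbed fraction {frac:.2f}")
```

The fraction that is not absorbed in the layer is what remains available to generate diffusion current in the bulk.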

The absorption coefficient α of Ge and Si as a function of wavelength
is given in Fig. 16. The width of the depletion layer in a uniformly doped 
material with a carrier concentration N under an applied external voltage 
V is given by Kahng?.?8 as 


$$ w = \sqrt{\frac{2\epsilon (V_D + V)}{qN}} \qquad (11) $$

Fig. 16 — Optical absorption coefficient of silicon and germanium at 300°K.

where V_D is the diffusion potential, ε the dielectric constant and q the
electron charge. The diffusion potential is the potential difference of 
the conduction band level between its value at the surface and its value 
inside the bulk material. Diffusion potentials of various metal-semi- 
conductor combinations can be obtained from data supplied by Cowley 
and Sze.*° The diffusion potential in a Schottky barrier photodiode is 
usually much smaller than the applied back bias V because of the requirement
w > 1/α. With q = 1.60 × 10⁻¹⁹ coulomb and ε = 8.85 ×
10⁻¹⁴ ε_r farad/cm one obtains


$$ w = 1.05 \times 10^{7} \sqrt{\frac{\epsilon_r (V_D + V)}{N}}. \qquad (12) $$


N is the doping level in carriers/cc and w is measured in μm or microns
(1 μm = 10⁻⁴ cm). The ratio of capacitance C to junction area A is

$$ \frac{C}{A} = \frac{\epsilon}{w} = \sqrt{\frac{q \epsilon N}{2 (V_D + V)}}. \qquad (13) $$

The width of the depletion layer for Ge and Si as a function of the
total voltage V_D + V is shown in Figs. 17 and 18. The breakdown limit

Fig. 17 — Depletion layer width in n-type silicon as a function of potential
difference V + V_D across depletion layer.




Fig. 18 — Depletion layer width in n-type germanium as a function of potential
difference V + V_D across depletion layer.


refers to an abrupt junction in the bulk. It is desirable that the deple- 
tion width satisfy (10). Moreover, for any particular application, the 
thickness w should be no thinner than is necessary to achieve a cutoff 
frequency which is twice the maximum operating frequency since the 
maximum available power from a photodiode depends inversely upon 
the square of the diode capacitance. It is not always possible to satisfy 
these requirements because of transit time considerations or because of 
material properties. The drift current for a specified depletion layer 
width w is 


$$ J_{\text{drift}} = q \phi \left( 1 - e^{-\alpha w} \right) \qquad (13) $$


where φ is the incident photon flux at the front of the depletion layer
and q the electron charge. For w = 2/α, for example, the reduction factor
1 − exp(−αw) has a value of 0.86. The total current will be larger
because 14 percent of the radiation will be absorbed beyond the depletion
layer in the bulk material and create diffusion current. The exact
amount of diffusion current under static condition may be found in a 
paper by Gartner.*° 
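The depletion width and capacitance relations of this section are easy to evaluate numerically. The following is a minimal sketch, assuming a uniformly doped abrupt barrier and the constants quoted in the text; the function names, the relative dielectric constant and the doping level in the example are illustrative assumptions.

```python
import math

Q = 1.60e-19     # electron charge in coulomb, as quoted in the text
EPS0 = 8.85e-14  # permittivity of free space in farad/cm, as quoted in the text

def depletion_width_um(eps_r: float, total_voltage: float, doping_per_cc: float) -> float:
    """Depletion layer width in microns; total_voltage is the sum of the
    diffusion potential and the applied back bias, in volts."""
    w_cm = math.sqrt(2.0 * eps_r * EPS0 * total_voltage / (Q * doping_per_cc))
    return w_cm * 1.0e4

def capacitance_per_area(eps_r: float, total_voltage: float, doping_per_cc: float) -> float:
    """Junction capacitance per unit area in farad/cm^2, computed as eps/w."""
    w_cm = depletion_width_um(eps_r, total_voltage, doping_per_cc) * 1.0e-4
    return eps_r * EPS0 / w_cm

if __name__ == "__main__":
    eps_r = 11.8     # assumed relative dielectric constant (silicon)
    doping = 1.0e15  # hypothetical doping level in carriers/cc
    for volts in (10.0, 50.0, 100.0):
        w = depletion_width_um(eps_r, volts, doping)
        c = capacitance_per_area(eps_r, volts, doping)
        print(f"V_total = {volts:5.1f} V: w = {w:.2f} um, C/A = {c:.3e} F/cm^2")
```

Computing C/A from ε/w rather than from a separate closed form keeps the two quantities consistent by construction.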
5.2 Frequency Limitations of the Photodiode


The frequency limitations for Schottky barrier photodiodes are determined
by the transit time of the carriers through the depletion layer,
the sheet resistance of the metal film, the resistance of the bulk material 
and the capacitance of the junction. 

The transit time for a carrier depends on the type of carrier and the 
location of its origin within the depletion layer. The carriers may reach 
saturation velocities for sufficiently high field intensities, e.g., 10⁷ cm/sec
for electrons in silicon. The holes will move at lower velocities. One has 
to remember, however, that holes are created predominantly in the high 
field region in the vicinity of the metal and will travel only a short dis- 
tance to the metal electrode. Electrons will have to travel over a much 
longer distance and through a region of low field intensities as shown in 
Fig. 19. The electron transit time τ_tr will thus be the predominant factor.
This transit time has been calculated by B. C. DeLoach*® by assuming 
that 

(i) electrons reach the saturation velocity v_s at the maximum field
in the junction, and

(ii) the transit of the carrier through the junction is completed when

it reaches the field HE, = kT, that means when it joins the free 
carriers with an average energy kT (0.026 eV at 300°K) to the
right of the swept space charge.

Fig. 19 — Pair creation under incident illumination with electric field intensity
E in depletion layer.

The result for the electron transit time τ_tr is


ta= (14) 
Typical values which may be achieved in a silicon surface barrier are
w = 5 microns and v_s = 10⁷ cm/sec. This leads to an electron transit
time τ_tr = 5 × 10⁻¹¹ sec. The corresponding cutoff frequency is f_c = 1/τ_tr =
20 GHz.
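The quoted numbers can be checked with a few lines of arithmetic. This sketch assumes that the transit time is approximated by the layer width divided by the saturation velocity, which reproduces the values in the text; the full expression (14) may contain further terms, and the function names are illustrative.

```python
def transit_time_s(w_um: float, v_sat_cm_per_s: float) -> float:
    """Approximate electron transit time in seconds, taken as w / v_s."""
    return (w_um * 1.0e-4) / v_sat_cm_per_s

def transit_cutoff_hz(tau_s: float) -> float:
    """Cutoff frequency taken as the reciprocal of the transit time."""
    return 1.0 / tau_s

if __name__ == "__main__":
    tau = transit_time_s(5.0, 1.0e7)  # w = 5 microns, v_s = 1e7 cm/sec
    print(f"transit time = {tau:.2e} sec")
    print(f"cutoff frequency = {transit_cutoff_hz(tau) / 1.0e9:.1f} GHz")
```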

The frequency response of a photodiode with a capacitance C per 
unit area and a sheet resistance R_s has been calculated by Lucovsky
and Emmons.” The cutoff frequency depends on the geometry of the 
diode and in particular on the location of the ohmic contact. Three 
types of contacts shown in Fig. 20 have been discussed by the authors. 
The 3-dB cutoff frequencies for the short circuited diodes are 





ee = Rape linear contact (15) 
We = S ring contact (16) 
‘RC? 
We = dot contact (17) 
R,C- F (a,b) 
with 
2 2 4 
a 3b 2b b 
EON pg gig ae) 


The dimensions a,b are defined in Fig. 20. The formulas have been 
derived for a p-n photodiode with the p-layer as the conducting layer. 
They remain fully valid if the p-layer is replaced by a thin metal film 
with a sheet resistance of R_s ohm/square. The cutoff frequency for the
linear contact shown in Fig. 20 is independent of the dimension L because
identical diodes connected in parallel will display the same frequency
response if they are operated under short circuit conditions.

Fig. 20 — Contact shapes for ohmic contacts on thin metal film of Schottky
barrier photodiode.

All three types of contacts can be used for Schottky barrier photo- 
diodes. Ring contacts will give the highest cutoff frequencies for a speci- 
fied diode area and given material properties. The fabrication of dot 
contacts is simpler because the ring contact has to be deposited by mask- 
ing off the center area of the diode for the deposition of the contact ring. 
Dot contacts can be evaporated through an ordinary metal mask with 
an array of holes. This makes dot contacts particularly useful for photo- 
diode arrays or for the fabrication of a large number of photodiodes on 
a single wafer which can be sliced up later. It is convenient to set the 
contact dot off center in order to facilitate the attachment of an external 
connection without interfering with the incident light beam. This is 
shown in Fig. 21. The large dots are semitransparent gold films on Si 
with a diameter of 10 mils and a sheet resistance of 5 to 7 ohm/square. 
The small contact dots have a diameter of 3 mils. The capacitance of 
each diode for a substrate material with a resistivity of 2.7 ohm cm at 
a back bias of 60 volt is 0.9 to 1.0 pF. The cutoff frequency cannot be 
calculated from (17) for the dot contact because the contact in Fig. 21 
is off center. A good approximation is obtained by applying (15) for
the linear contact with b being the diameter of the semitransparent gold
film. The cutoff frequency obtained for this particular diode at the specified
back bias of 60 volt is f_c = ω_c/2π = 22 GHz. This cutoff is of the
same order as the cutoff frequency obtained from transit time considera- 
tions. 





Fig. 21 — Array of Schottky barrier photodiodes on epitaxial silicon wafer 
before deposition of antireflecting coating. The diameter of the large semitrans- 
parent gold dots is 0.25 mm, the diameter of the small gold contact dots is 0.075 
mm.

VI. FABRICATION AND PACKAGING OF Si-Au-ZnS PHOTODIODES


The combination of materials for Schottky barrier photodiodes de- 
pends on the frequency range of the incident radiation and the metal- 
lurgical properties of the metal-semiconductor system. Silicon and ger- 
manium are both suitable for the visible range of the spectrum. Stable 
Schottky barriers can be formed with a number of other metals, e.g., 
Ag, Al, Pt, and Ni. The eutectic temperatures of the various metal-semiconductor
combinations determine the maximum device temperature.
The eutectic temperature of Au-Si is 370°C. The choice of the
matching coating is governed by the optical surface impedance of the 
metal-semiconductor combination. A total optical return loss of 12 dB 
can be achieved with ZnS which has a dielectric constant «, = 2.3. A 
higher return loss may be desirable; however, the stability of ZnS and 
the good adherence to the Au is an advantage compared to other di- 
electric materials. 

The surface preparation of the semiconductor substrate is relatively 
simple. Epitaxial silicon wafers n on n+ with alloyed gold antimony 
back contacts are rubbed with a clean cotton swab under methanol, 
boiled in distilled water, etched in HF, washed in distilled deionized
water, washed in methanol and finally dried with nitrogen. The wafer 
is covered with a molybdenum mask with an array of 10-mil diameter 
holes. Only 2 of the wafer are covered by this mask. The remaining 4 
of the wafer is later used for test purposes of the optical return loss dur- 
ing the evaporation of the antireflection coating. The unit is transferred 
into the vacuum system which is pumped down to a pressure of 2-3 X 
10~ torr measured at its pumping port. Gold is evaporated from a tung- 
sten coil. The sheet resistance of the gold is measured on a 1-mm quartz 
slide which is located adjacent to the silicon wafer. The evaporation is 
discontinued when the sheet resistance measured on the quartz slide 
is in the range of 5 to 7 ohm/square. Separate measurements have shown 
that the sheet resistance of the Au on the Si wafer is also in the 5 to 7 
ohm/square range. 

A second deposition of Au through a molybdenum mask with an array 
of 3-mil holes is made on the wafer. The 3-mil Au dots are needed later 
for contacting purposes. A photograph of the semitransparent Au dots 
and the contacting 3-mil dots is shown in Fig. 21. The second mask is 
removed after the Au evaporation and the unit is mounted in a vacuum 
system which is equipped with an optical reflectometer at λ = 6328 Å.
Zinc sulfide is evaporated on the wafer. The reflectance from the test
area on the wafer is measured continuously and the evaporation is
stopped at the first maximum of the return loss. Typical results obtained
from the reflectometer recording are shown in Fig. 14. 

The step of depositing 3-mil Au contacts is repeated in order to facili- 
tate thermocompression bonding to the contact area. The second mask 
is mounted in exact registry with the first evaporation of contact dots. 
This means that a ZnS layer is sandwiched between two identical con- 
tact dots. This layer is shorted out after completion of the thermocom- 
pression bonding process of a 1-mil Au wire to the contact dot. Fig. 22 
is a photograph of the wafer surface after completion of all evaporation 
processes. The wafer surfaces shown in Figs. 21 and 22 are both illu- 
minated from a standard tungsten lamp for obtaining the photographs. 

A cross sectional view of the detector packaged into a modified type 
N connector body is shown in Fig. 23 and a photograph of the completed 
structure in Fig. 24. A metallized quartz washer is used for electrical 
separation of the diode terminals. A bypass capacitor provides an RF 
short between the outer conductor of the connector body and the ter- 
minal which is connected to the metal side of the Schottky barrier. 
Various parts of the package are identified in the figure caption. 


VII. MEASUREMENT OF PULSE RESPONSE AND NET QUANTUM EFFICIENCY 


The pulse response of packaged Schottky barrier photodiodes has 
been examined by phase locking the TEM_00q modes of a 6328 Å He-Ne
gas laser with an internal phase modulator. The laser output consists 
of pulses with a half width of approximately 0.5 nanoseconds separated 
by 11 nanoseconds. The average optical power is 0.3-0.4 milliwatt. 

Fig. 22 — Array of Schottky barrier photodiodes on epitaxial silicon wafer
after deposition of the antireflecting coating. The dot dimensions are the same as
in Fig. 21.

Fig. 23 — Cross-sectional view of diode package. (1. Silicon wafer. 2. Thin gold
film. 3. Thermocompression bond on gold contact. 4. Contact wire to quartz
washer. 5. Quartz washer, top and bottom are metallized. 6. Brass pin. 7. Brass
adapter ring. 8. Bypass button capacitor 1000 pF. 9. Inside metal connection of
button capacitor and external lead for applying dc bias to photodiode. 10. External
metal connection of button capacitor. 11. Brass pin forming part of center conductor
of the connector. 12. Steel spring. 13. Teflon spacers. 14. Connector body.
15. Steel washer.)

Fig. 24 — Completed diode package showing connector body and external
lead wire for dc bias.

A typical pulse response obtained with a completed photodiode displayed
on a Tektronix sampling oscilloscope Type 661 with a rise time
of 0.1 nsec is shown in Fig. 25. The diode is made on 2.7-ohm cm epi- 
taxial silicon with the 10-mil diameter dots shown in Figs. 21 and 22. 
The half width of the pulses is 0.45 nsec at a back bias of 50 volts. The 
net quantum efficiency measured with an Eppley thermocouple #4952
as a reference is 70 percent. This efficiency is obtained by graphical 
integration of the pulse shape shown in Fig. 25 and by assuming that 
the diode acts like an ideal current source into the 50-ohm broadband 
load of the sampling oscilloscope. 
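The efficiency estimate described above can be sketched numerically. The sketch assumes that the charge per pulse is obtained by trapezoidal integration of the oscilloscope voltage divided by the load resistance, and that the optical energy per pulse is the average power times the pulse spacing; the function names and the sample pulse are hypothetical and do not represent the recorded trace of Fig. 25.

```python
H = 6.626e-34        # Planck constant, joule-sec
C_LIGHT = 2.998e8    # speed of light, m/sec
Q = 1.602e-19        # electron charge, coulomb

def charge_per_pulse(times_s, volts, load_ohm=50.0):
    """Trapezoidal integral of i(t) = v(t)/R over one pulse, in coulomb."""
    total = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        total += 0.5 * (volts[k] + volts[k - 1]) * dt / load_ohm
    return total

def net_quantum_efficiency(times_s, volts, avg_power_w, period_s,
                           wavelength_m, load_ohm=50.0):
    """Collected electrons per pulse divided by incident photons per pulse."""
    electrons = charge_per_pulse(times_s, volts, load_ohm) / Q
    photon_energy = H * C_LIGHT / wavelength_m
    photons = avg_power_w * period_s / photon_energy
    return electrons / photons

if __name__ == "__main__":
    # Hypothetical triangular pulse, 0.9 nsec base width, 0.15 V peak.
    t = [0.0, 0.45e-9, 0.9e-9]
    v = [0.0, 0.15, 0.0]
    eta = net_quantum_efficiency(t, v, 0.35e-3, 11e-9, 632.8e-9)
    print(f"net quantum efficiency of the sample pulse: {eta:.2f}")
```

The ideal-current-source assumption enters only through the division by the 50-ohm load; any package capacitance redistributes the charge in time but does not change the integral.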

A close inspection of the pulse shape shows that the leading edge is 
slightly steeper than the trailing edge. The distortion in the trailing 
edge is due to diffusion current and to case capacitance in the package. 
The influence of the diffusion current can be examined by observing the 
pulse shape for various back bias conditions. Fig. 26 is the pulse re- 
sponse of the same diode at a back bias of 0, 4, 15, and 50 volt. A diffu- 
sion tail is clearly visible at a back bias of 0 volt and 4 volt. The diffu- 
sion tail is depressed at higher back bias because more carriers are created 
within the depletion layer. 

An important property required for many practical applications is a 
uniform response of the photodiode over the entire area of the junction. 





Fig. 25 — Pulse response of packaged diode at 50-volt back bias into 50-ohm 
load obtained from phase locked modes of He-Ne gas laser. Horizontal scale 0.5 
nsec/cm, vertical scale 20 mvolt/cm.

Fig. 26 — Pulse response of Schottky barrier photodiode at 0-volt, 4-volt,
15-volt, and 50-volt back bias. Horizontal scale 1 nsec/cm, vertical scale 25 mvolt/cm.





Fig. 27 — Pulse response for 9 points on the same photodiode obtained by linear 
scanning of a focused laser beam over diode area by 0.025-mm increments. Horizontal
scale 2 nsec/cm, vertical scale 25 mvolt/cm.

Fig. 27 shows the pulse response of a Schottky barrier photodiode at
various locations of the diode. A laser beam is focused on the front surface
of the diode and is scanned across the diode. The pulse response is
measured on an axis at discrete points which are spaced 1 mil apart.
The peak variation is less than 3 percent over a total distance of 7 mils.
The reduced pulse response in the vicinity of the boundaries is due to
the fact that there is a small thickness change of the antireflection coating
close to the boundary. This change of thickness is due to different
sticking coefficients and different surface mobilities of the zinc sulfide
on gold and on silicon during the evaporation process. One observes
therefore a reduced amplitude response with no degradation of the pulse
shape.


VIII. CONCLUSIONS


Schottky barrier photodiodes can be used for fast and efficient optical 
detectors. The high efficiency is obtained because radiation can be 
coupled through thin metal films with relatively low loss at optical frequencies.
The small reflectance of the diode is achieved by proper choice
of the matching layer. A diode with a fast response is obtained by designing
junctions with a small RC product. The problem is similar to building
high cutoff Schottky barriers for microwave and millimeter wave
circuits. Additional limitations are due to transit time effects which
are common to all solid-state radiation detectors based on carrier generation.

Topology of Thin Film RC Circuits


By F. W. SINDEN 
(Manuscript received August 31, 1966) 


Integrated RC circuits can be made by depositing exceedingly thin 
metallic and dielectric films in suitable patterns on an insulating substrate. 
Resistors are strips of conductor; capacitors are patches on which conducting, 
dielectric, and conducting layers are superimposed. Since conductors can 
cross at capacitor patches, RC networks need not be strictly planar to be 
realizable in thin film. 

Determining which RC circuits are realizable poses new problems in 
topology which are remarkably simple to state but are as yet unsolved. The 
results reported here are fragmentary, but they do cover some cases of small 
order that may be of practical interest. 


I. INTRODUCTION 


Integrated RC circuits can be made by depositing exceedingly thin 
metallic and dielectric films in suitable patterns on an insulating sub- 
strate. A resistor is made by depositing a long, narrow strip of conductor 
(usually in a zig-zag for compactness); a capacitor is made by superimposing
conducting, dielectric, and conducting layers. Because the
dielectric is thin, the capacitance per unit area is high. Fig. 1 shows a 
typical thin film pattern. 

Ordinarily printed circuits are strictly planar; crossovers are made 
only by leading one of the conductors entirely out of the plane of the 
circuit. In the thin film technique, however, conductors can be separated 
by thin insulating layers within the plane of the circuit. Thus, cross- 
overs can be permitted provided a nonzero capacitance between the 
crossing conductors is acceptable. If an RC circuit can be laid out so 
that conductors cross only if the circuit requires a nonzero capacitance 
between them, we will say the circuit is realizable in thin film or just 
realizable. 

An example of a realizable nonplanar circuit is shown in Fig. 2. In 
this case, the schematic thin film layout brings out intrinsic symmetries 
not displayed by the circuit diagram.

Fig. 1 — Thin film layout for a notch filter (courtesy W. H. Orr). Black region
is bottom conductor; shaded region is dielectric; white region is top conductor.


Finding feasible layouts, or even determining when they exist, leads 
to unsolved problems in topology. The results presented here give 
answers only in special cases. Moreover, these results concern only the 
topological side of the problem; electrical equivalences are not taken into 
account. It is assumed that the network is given topologically and that 


(a) (b) 


Fig. 2— (a) Nonplanar circuit (‘‘twin-tee’’, Ref. 3, p. 309); (b) schematic 
thin film layout for the circuit in (a). ° 
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terminals to the outside are located in given fixed positions on the 
periphery of the board. 


II. SEPARATION OF THE RESISTIVE AND CAPACITIVE PARTS 


Given an RC network N, let R_N be the purely resistive network obtained
by replacing every capacitor by a direct connection. Clearly N
is not realizable in thin film unless R_N is. R_N is realizable only if its
graph (a vertex for each conductor, an edge for each resistor) is planar
under the restrictions imposed by the locations of the terminals to the
outside (see Fig. 3). This observation provides a first check: if R_N is not
planar, there is no need to proceed further.
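The reduction from N to its resistive graph can be sketched with a union-find pass over the capacitors. This is an illustrative sketch only: the function name and the numbering of conductors are assumptions, resistors that end up shorted by the merge are dropped, and parallel resistors are collapsed to one edge, neither of which affects planarity. The planarity test itself is not included.

```python
def reduced_resistive_graph(num_conductors, capacitors, resistors):
    """Merge conductors joined by capacitors, then list the resistor edges
    between the merged groups.

    capacitors, resistors: lists of (a, b) pairs of conductor indices.
    Returns (labels, edges): labels[i] is the group representative of
    conductor i; edges is a sorted list of distinct group pairs.
    """
    parent = list(range(num_conductors))

    def find(x):
        # Path-halving find.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Replacing a capacitor by a direct connection merges its two conductors.
    for a, b in capacitors:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    labels = [find(x) for x in range(num_conductors)]
    edges = set()
    for a, b in resistors:
        u, v = labels[a], labels[b]
        if u != v:  # a resistor shorted by the merge contributes no edge
            edges.add((min(u, v), max(u, v)))
    return labels, sorted(edges)

if __name__ == "__main__":
    # Hypothetical four-conductor network with one capacitor and three resistors.
    labels, edges = reduced_resistive_graph(4, [(0, 1)], [(1, 2), (2, 3), (0, 3)])
    print(labels, edges)
```

Each group of merged conductors corresponds to one vertex of the reduced graph and, in the terminology of the following paragraphs, to one purely capacitive vertex-network.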

Each vertex in the graph of R_N replaces a purely capacitive network.
In Fig. 3, for example, the vertex V in R_N replaces the network shown
in Fig. 4.

One way to construct a realization of N is to construct realizations
for the individual vertex-networks, and then to fit these into the planar
layout of R_N. Since the layout of R_N may not be unique (there may be
more than one ordering of edges about a vertex) the conditions on the
vertex-networks may not be unique.

Another approach, discussed briefly in the final section, is to modify 
algorithms for purely capacitive networks to take account of resistors. 
In either case, one needs to study the purely capacitive networks first. 


III. PURE C NETWORKS 


A pure C network is a set of zero-resistance conductors c_1, ..., c_r
some pairs of which are connected by capacitors. The problem of finding
a feasible layout for such a network is the following:

For each conductor c_i find a connected region R_i in the plane such that

(i) R_i and R_j have common points if and only if c_i and c_j are connected
by a capacitor, and





Fig. 3 — Nonplanar RC network N and reduced purely resistive network R_N.

Fig. 4 — Capacitive network for vertex V of Fig. 3 and realization of this
network.


(ii) no point belongs to more than two regions.

Condition (ii) says that no more than two conductors (separated by
dielectric) may be superimposed. If, contrary to condition (ii), conducting
and dielectric layers can be stacked up indefinitely, then every connected
C network has a feasible layout. (The network is connected if any
conductor can be reached from any other through a sequence of capaci- 
tors.) This is not quite immediately obvious; a proof is given in Appendix 
A.1. 

Indefinite stacking offers other advantages as well.¹ Unfortunately it
also presents technical difficulties. To date most thin film circuits have 
been limited to two conducting layers. 

It does not change the problem to replace the connected regions R_i
by curves C_i of finite length, since a connected region can be nearly filled
by a curve of finite length, and a curve of finite length can be approxi- 
mated by a narrow region. When convenient, the curves can have 
branches, although this is not necessary, since a branch can be approxi- 
mated by letting the curve double back. In some cases, a pair of curves, 
whether branched or not, have to cross more than once (examples later). 
Such multiple crossings will be permitted on the assumption that a 
capacitance, if need be, can be distributed over several crossovers. Some- 
times the curves are more convenient and sometimes the regions. I will 
use both. 

In addition to satisfying conditions (i) and (ii) the regions (or curves)
may have to satisfy constraints associated with the terminals to the
outside. More specifically, R_1, ..., R_r may be required to lie within a
given region R and certain of the R_i may be required to contain specified
points P_i on the boundary of R. I will consider mainly the two extreme
cases where (a) there are no such terminal constraints and (b) every
region R_i satisfies a terminal constraint.


IV. UNCONSTRAINED CASE 


The problem is simply stated: It is specified which pairs of a set of 
curves (or connected regions) in the plane cross and which pairs do not. 
When are such specifications consistent? 

To get a feeling for the problem, the reader may wish to try the ex- 
amples in Fig. 5. 

The crossings are conveniently specified by means of a graph G. 
Associate a vertex with each curve, and let two vertices be joined by an 
edge if and only if the corresponding curves are required to cross. If a 
set of curves satisfying the crossing specifications exists, we will say 
that the graph G is realizable. 

If G is planar, then it is realizable. In a planar representation of G 
one has merely to replace each vertex v_i by a star-shaped region R_i
whose points extend out along the edges emanating from v_i far enough
to overlap the points of neighboring regions. 

The converse is not true; some nonplanar graphs are realizable. For 
instance, any complete graph (nonplanar if the order is greater than 
four) is realizable, for in this case every curve C_i crosses every other.
(Let the C_i be straight lines in general position; i.e., no two parallel, no
three through a point.) 
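The parenthetical construction can be checked mechanically for a concrete family of lines. The sketch below uses the lines y = i·x + i², an illustrative choice that is not taken from the paper, and verifies with exact rational arithmetic that every pair crosses and that no three lines pass through one point.

```python
from fractions import Fraction
from itertools import combinations

def sample_lines(n):
    """Return n lines y = m*x + c as (m, c) pairs, with m = i and c = i*i."""
    return [(Fraction(i), Fraction(i * i)) for i in range(1, n + 1)]

def crossing_point(line_a, line_b):
    """Intersection of two non-parallel lines, or None if they are parallel."""
    (m1, c1), (m2, c2) = line_a, line_b
    if m1 == m2:
        return None
    x = (c2 - c1) / (m1 - m2)
    return (x, m1 * x + c1)

def in_general_position(lines):
    """True if no two lines are parallel and no three pass through one point."""
    points = [crossing_point(a, b) for a, b in combinations(lines, 2)]
    if any(p is None for p in points):
        return False
    # Three concurrent lines would make two different pairs share a point.
    return len(set(points)) == len(points)

if __name__ == "__main__":
    print(in_general_position(sample_lines(6)))
```

For this family the lines i and j meet at (−(i + j), −ij), so distinct pairs give distinct crossing points.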





Fig. 5 — Examples of unconstrained case. With the exception of the dashed
curve, a pair of curves must cross if and only if they cross in the figure. The dashed
curve must make only the encircled crossings. One of these examples has a solution;
the other does not. Answers are given in Appendix A.2.

Although nonrealizability is different from nonplanarity, there is a class
of nonrealizable graphs that is related to nonplanar graphs. If G is nonplanar,
then the graph G* obtained by inserting a new vertex into each
edge of G is nonrealizable (see Fig. 6). If G* were realizable, one could
construct a planar representation of G as follows. In a realization of G*
let each of the curves C_i corresponding to an original vertex of G shrink
to a point in such a way that no new crossings are generated. This is
always possible. Since by assumption the remaining curves (corresponding
to edges of G) do not cross each other, the resulting figure is a planar
representation of G.
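The construction of G* from G is a one-pass subdivision of the edge list. The sketch below is illustrative; the function name and the vertex numbering (new vertices are appended after the original ones) are assumptions.

```python
from itertools import combinations

def subdivide(num_vertices, edges):
    """Insert one new vertex into each edge of a graph.

    Returns (new_vertex_count, new_edges). Original vertices keep their
    indices; the vertex inserted into the k-th edge gets index
    num_vertices + k.
    """
    new_edges = []
    next_vertex = num_vertices
    for u, v in edges:
        new_edges.append((u, next_vertex))
        new_edges.append((next_vertex, v))
        next_vertex += 1
    return next_vertex, new_edges

if __name__ == "__main__":
    # The complete graph of order five is nonplanar, so its subdivision
    # is nonrealizable by the argument above.
    k5_edges = list(combinations(range(5), 2))
    count, star_edges = subdivide(5, k5_edges)
    print(count, len(star_edges))
```

In a realization of G*, the curves for the appended vertices play the role of the edges of G: each may cross only the two curves of its endpoints.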

A theorem of Kuratowski² states that any nonplanar graph can be
reduced to one of two minimal nonplanar graphs G_1 or G_2 (Fig. 7) by
(i) deleting edges and (ii) combining adjacent vertices.†


Fig. 6 — G is nonplanar; G* is nonrealizable. On the right is a nonrealization of
G*; crossings marked with dots are required, no others are permitted.


The two operations (i) and (ii) clearly preserve planarity. Operation
(ii) also preserves realizability, but (i) does not. (If it did, all graphs
would be realizable, since any graph can be constructed by deleting edges
of a complete graph, which is realizable.) To preserve realizability it is
necessary to replace (i) by the weaker operation (i′): deleting vertices
(together with attached edges). To see that (i′) and (ii) do indeed
preserve realizability one has only to interpret them as operations on the
curves C_i.
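The two realizability-preserving operations are simple to state on an adjacency structure. In the sketch below, which uses a dictionary of neighbor sets as an assumed representation, deleting a vertex corresponds to removing a curve, and combining two adjacent vertices corresponds to treating two crossing curves as a single curve (their union).

```python
def delete_vertex(adj, v):
    """Operation (i'): remove vertex v together with its attached edges."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}

def combine_adjacent(adj, u, v):
    """Operation (ii): merge adjacent vertices u and v into the single vertex u."""
    if v not in adj[u]:
        raise ValueError("vertices must be adjacent to be combined")
    merged = (adj[u] | adj[v]) - {u, v}
    out = {}
    for w, nbrs in adj.items():
        if w == v:
            continue
        if w == u:
            out[w] = set(merged)
        else:
            updated = set(nbrs)
            if v in updated:
                # Edges that led to v now lead to the merged vertex u.
                updated.discard(v)
                updated.add(u)
            out[w] = updated
    return out

if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1}}
    print(combine_adjacent(path, 0, 1))
    print(delete_vertex(path, 1))
```

Both functions return new dictionaries and leave the input unchanged, so a reduction sequence can be explored without copying by hand.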

Using operations (i′) and (ii) and Kuratowski's theorem we can
identify a class of nonrealizable graphs as follows.

Let G_1* and G_2* be the graphs obtained by inserting a new vertex


† G_1 is the graph involved in the familiar problem of connecting three utilities
(e.g., the gas, water, and electric plants) to three houses without crossing lines.
Since G_1 is nonplanar there is no solution. In Fig. 7 vertices 1, 3, and 5 can be taken
as the utilities and 2, 4, and 6 as the houses.

Fig. 7 — Kuratowski graphs G_1 and G_2.


into each edge of the Kuratowski graphs G_1 and G_2. A graph is nonrealizable
if it can be reduced to G_1* or G_2* by application of (i′) and
(ii). G_1* and G_2* are themselves irreducible. In Appendix A.2 one of the
examples in Fig. 5 is shown to be reducible to G_1*, hence nonrealizable.

The analogue of Kuratowski's theorem which would say that every
nonrealizable graph can be reduced to G_1* or G_2* is false. An example of
a nonrealizable graph that cannot be so reduced is given in Appendix
A.3.


V. CONSTRAINED CASE 


In addition to satisfying the conditions (i) and (ii) in Section III,
the curves C_i (or the regions R_i) will now be required to lie within a
simply-connected region R (which we shall take to be a disk) and each
C_i will be required to contain a specified point P_i on the boundary of R.
(This covers the case where a single conductor is required to join two 
or more separate terminals. One has only to require that the correspond- 
ing curves cross each other; their union represents the conductor.) 

Before proceeding further, the reader may wish to try the examples 
in Fig. 8. 

In passing, we observe that any constrained problem can be imbedded 
in an unconstrained problem. The constraints can be simulated by 
means of a ring structure containing 2r curves, where r is the number of 
curves in the constrained problem. This is proved in connection with the 
example discussed in Appendix A.3. Unfortunately, this observation is 
of little use in the absence of more information about the unconstrained 
case. 

We will regard the vertices v_1, ..., v_r of graph G as residing at the
terminal points P_1, ..., P_r. We will often make use of the complement
Ḡ of G, where Ḡ consists of all edges not in G. Edges in G will be shown
as solid lines, edges in Ḡ as dashed lines.

Fig. 8 — Examples of constrained case. Curve C_i must contain point P_i and lie
otherwise within the circle. The dashed lines show the edges not in G, i.e., if P_i
and P_j are connected by a dashed line then curves C_i and C_j may not cross; otherwise
C_i and C_j must cross. One example has a solution, the other does not. Answers
in Appendix A.4.

A subset of vertex points P_{i_1}, ..., P_{i_m} such that i_1 < i_2 < ... < i_m
will be called a cycle if all the pairs

$$ (P_{i_1}, P_{i_2}),\ (P_{i_2}, P_{i_3}),\ \ldots,\ (P_{i_m}, P_{i_1}) $$

are joined by edges. A cycle will be called empty if no other pairs are
joined by edges. We will be primarily concerned with empty cycles in
the complementary graph Ḡ. (See Fig. 9.)


Theorem 1: A necessary condition for a constrained graph G to be realizable
is that Ḡ contain no empty cycles of order four or more.

Proof: (i) If Ḡ is an empty cycle of order four, then G is not realizable.
This is easily verified by inspection. If, therefore, Ḡ contains an empty
cycle of order four, then G is not realizable.





Fig. 9 — (a) Empty cycle in Ḡ, (b) non-cycles. Dashed edges belong to Ḡ;
edges not shown belong to G.




(ii) Suppose the theorem is known to be true for cycles of order
4, ..., m - 1 and suppose, contrary to the theorem, that Ḡ contains
an empty cycle of order m and that G is realizable. The realization of G
can be generated in the following way: let curve C_1 grow continuously
out of point P_1 until it reaches its full length, then let curve C_2 grow
out of point P_2 until it reaches its full length, and so on until all curves
are complete.

Let Ḡ(t) be the corresponding complementary graph at time t. At
the beginning, Ḡ(t) is the complete graph (no crossings); as the crossings
are generated one by one, edges are deleted from Ḡ(t). At some stage the
postulated empty cycle of order m, which is contained in the final form
of Ḡ, must have just one internal edge left. But this last internal edge
forms two empty cycles inside the final cycle, at least one of which
must be of order four or more (since m > 4) and less than order m.
Therefore, by the induction hypothesis, there can be no realization at
this intermediate stage. Contradiction.
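
For graphs of small order the necessary condition of Theorem 1 can be tested by exhaustive search. The following is a minimal sketch, not part of the original paper. It assumes the vertices are labelled 0, ..., n-1 in their order around the circle and that the complementary graph is supplied as a list of vertex pairs; the function names are hypothetical.

```python
from itertools import combinations


def is_empty_cycle(subset, gbar):
    """True if `subset` (indices in increasing, i.e. circular, order) is an
    empty cycle: the complementary-graph edges among its members are exactly
    the pairs of cyclically consecutive members."""
    m = len(subset)
    ring = {frozenset((subset[a], subset[(a + 1) % m])) for a in range(m)}
    induced = {frozenset(p) for p in combinations(subset, 2)
               if frozenset(p) in gbar}
    return induced == ring


def empty_cycles(n, gbar_edges, min_order=4):
    """List every empty cycle of order >= min_order in the complementary
    graph. Brute force over all vertex subsets, so only for small n."""
    gbar = {frozenset(e) for e in gbar_edges}
    found = []
    for m in range(min_order, n + 1):
        for subset in combinations(range(n), m):
            if is_empty_cycle(subset, gbar):
                found.append(subset)
    return found
```

A non-empty result means, by Theorem 1, that the constrained graph is not realizable; an empty result does not by itself guarantee a realization.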


For some time it appeared to me that the empty cycle condition was 
not only necessary for the realizability of a constrained graph, but suffi- 
cient as well. Recently, though, I found a counterexample of order eight. 
This example is discussed in Appendix A.5. 

Following are a number of results that help to identify and construct 
special classes of realizable constrained graphs. Taken together these 
seem to cover most cases of small order. 

If no two edges of G cross, then clearly G is realizable. Less obvious
is a similar result for Ḡ:


Theorem 2: A sufficient condition for a constrained graph G to be realizable
is that Ḡ contain no empty cycles of order four or more and that no two edges
of Ḡ cross.


An example of such a Ḡ is the triangulated polygon of Fig. 8(b). This
example, typical of the genre, has a complicated solution with unavoidable
multiple crossings.
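
The "no two edges cross" part of the hypothesis is easy to check mechanically. The sketch below is illustrative only; it assumes vertices are numbered 0, ..., n-1 around the circle and edges are drawn as straight chords, so that two edges cross exactly when their endpoints interleave. The function names are hypothetical.

```python
from itertools import combinations


def chords_cross(e, f):
    """Two chords of a circle cross iff they share no endpoint and exactly
    one endpoint of f lies strictly between the endpoints of e."""
    a, b = sorted(e)
    c, d = sorted(f)
    if len({a, b, c, d}) < 4:
        return False  # a shared endpoint is not counted as a crossing
    return (a < c < b) != (a < d < b)


def has_crossing_edges(edges):
    """True if any two edges of the given graph cross."""
    return any(chords_cross(e, f) for e, f in combinations(edges, 2))
```

For a triangulated polygon such as the one mentioned above, `has_crossing_edges` returns `False`, since the diagonals of a triangulation never interleave.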

Theorem 2 is proved in Appendix B. A more general result, also proved 
in Appendix B, is the following: 


Theorem 3: (i) If (P_1, P_k) is an edge of G that crosses no other edges of G,
and if the subgraphs G' with vertices P_1, P_2, ..., P_k and G'' with vertices
P_k, ..., P_r, P_1 are both realizable, then G is realizable.

(ii) If (P_1, P_k) is an edge of Ḡ that crosses no other edges of Ḡ, and if
the subgraphs G' with vertices P_1, P_2, ..., P_k and G'' with vertices
P_k, ..., P_r, P_1 are both realizable, then G is realizable.

The following two theorems describe circumstances under which a
new curve C_{r+1} can be added to an existing solution. In many cases
the entire solution can be generated by adding curves one at a time.


Theorem 4: Let G be a constrained graph with vertices P_1, ..., P_r, P_{r+1}.
G is realizable if (i) the subgraph of G with vertices P_1, ..., P_r is realizable,
and (ii) there do not exist three vertices P_i, P_j, P_k, i < j < k < r + 1,
such that P_{r+1}P_i and P_{r+1}P_k are edges of Ḡ and P_iP_k and P_{r+1}P_j are
edges of G. (See Fig. 10.)


Though cumbersome to state, this theorem is usually easy to apply.
The following special cases are often useful by themselves. Let S be the
set of vertices joined to P_{r+1} by edges of Ḡ. Special case 1: the vertices
of S are an adjacent string. Special case 2: every pair of vertices in S
is joined by an edge of Ḡ. Special case 2, for instance, can solve examples
like Fig. 8(b) in which Ḡ is a triangulated polygon. One has only to add new
vertices one at a time in such a way that each additional vertex forms
one new triangle in Ḡ. The set S always has just two members.

Theorem 4 is proved in Appendix B. Though somewhat involved
when worked out in detail, the idea of the proof is simple. In the situation
of Fig. 10 the curves C_i and C_k (emanating from P_i and P_k) form
a barrier which C_{r+1} cannot cross. This does not necessarily prevent
C_{r+1} from intersecting C_j, for it is possible that C_j could cross the barrier.
If, however, the barrier is not there, then C_{r+1} can reach C_j on its own
without C_j's help. If there are no barriers of the Fig. 10 type, then
C_{r+1} can reach all of the curves it is supposed to cross no matter how
these may have been drawn. Thus, the new curve C_{r+1} can be added
without disturbing the old ones.
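
Hypothesis (ii) of Theorem 4 can be checked by enumerating triples. The sketch below is an illustration under stated assumptions, not the author's procedure. It reads the forbidden configuration in the way the barrier argument requires: the new curve must not cross C_i or C_k, C_i and C_k must cross each other, and the new curve must cross C_j. Vertices are numbered 0, ..., n-1 around the circle, the last one being the vertex to be added, and the graph is given by its list of must-cross pairs. The function name is hypothetical.

```python
from itertools import combinations


def forbidden_triples(n, g_edges):
    """Return all triples (i, j, k), i < j < k < n-1, that form the barrier
    configuration for the last vertex n-1:
      (n-1, i) and (n-1, k) are NOT must-cross pairs,
      (i, k) and (n-1, j) ARE must-cross pairs."""
    g = {frozenset(e) for e in g_edges}
    new = n - 1

    def must_cross(a, b):
        return frozenset((a, b)) in g

    bad = []
    for i, j, k in combinations(range(new), 3):
        if (not must_cross(new, i) and not must_cross(new, k)
                and must_cross(i, k) and must_cross(new, j)):
            bad.append((i, j, k))
    return bad
```

If the list is empty and the graph on the first n-1 vertices is realizable, Theorem 4 says the last curve can be added to any realization of the rest.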





Fig. 10 — Configuration forbidden by hypothesis of Theorem 4. Dashed lines
show edges of Ḡ; solid lines show edges of G.

The next theorem concerns an operation which I will call an adjacent
interchange. Given the circle R with the peripheral points P_1, ..., P_r,
let R' be a slightly smaller circle concentric to R with corresponding
peripheral points P_1', ..., P_r'. Let the primed points have the same
order as the unprimed points except for one adjacent pair P_k', P_{k+1}',
which is interchanged. The points P_1, ..., P_r can be joined, respectively,
to P_1', ..., P_r' by curves C_1, ..., C_r in such a way that only
C_k and C_{k+1} cross. (See Fig. 11.)

If the operation is repeated by means of a new circle R'' inside R',
then the curves C_i are extended inward and one new crossing is generated.
A sequence of such operations can be specified by giving the pair
of currently adjacent points that is to be interchanged.
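
A sequence of adjacent interchanges is simple to simulate. In the sketch below (an illustration only), the current circular order is kept as a list and each interchange is specified by the position of the first point of the adjacent pair, which is an assumed encoding rather than one taken from the paper. Each interchange generates one crossing between the two curves involved.

```python
def apply_interchanges(n, swaps):
    """Start with curves 0..n-1 in circular order and apply adjacent
    interchanges. `swaps` lists positions; position p exchanges the curves
    currently at positions p and p+1 (taken mod n, since the points lie on
    a circle). Returns the final order and the crossings generated."""
    order = list(range(n))
    crossings = []
    for pos in swaps:
        nxt = (pos + 1) % n
        a, b = order[pos], order[nxt]
        order[pos], order[nxt] = b, a
        crossings.append((min(a, b), max(a, b)))
    return order, crossings
```

For example, with three curves the sequence `[0, 1]` first crosses curves 0 and 1 and then curves 0 and 2.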

Theorem 5 states the conditions under which all of the intersection
requirements of a curve can be satisfied by a sequence of adjacent
interchanges. These conditions involve cycles in Ḡ (not necessarily
empty) as defined just before Theorem 1. Note that the order of vertices
in a cycle of Ḡ is invariant under adjacent interchanges.

We will say that a member P_j of a cycle in Ḡ is active if it is joined to
some other member of the cycle by an edge of G.


Theorem 5: The intersection requirements of a curve C_j can be satisfied
entirely by a sequence of adjacent interchanges if and only if P_j is not an
active member of any cycle in Ḡ.
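
The condition of Theorem 5 can also be tested by brute force for small graphs. The sketch below is illustrative and makes two assumptions: vertices are numbered 0, ..., n-1 around the circle, and a cycle has at least three members. A vertex is reported as active if some cycle of the complementary graph contains it together with a vertex whose curve it must cross. The function names are hypothetical.

```python
from itertools import combinations


def is_cycle(subset, gbar):
    """True if cyclically consecutive members of `subset` (in circular
    order) are all joined by complementary-graph edges."""
    m = len(subset)
    return all(frozenset((subset[a], subset[(a + 1) % m])) in gbar
               for a in range(m))


def is_active_in_some_cycle(j, n, gbar_edges):
    """True if vertex j belongs to a cycle of the complementary graph that
    also contains a vertex k with (j, k) NOT in the complementary graph,
    i.e. with curves j and k required to cross."""
    gbar = {frozenset(e) for e in gbar_edges}
    others = [v for v in range(n) if v != j]
    for m in range(3, n + 1):
        for rest in combinations(others, m - 1):
            subset = tuple(sorted(rest + (j,)))
            if is_cycle(subset, gbar) and any(
                    frozenset((j, k)) not in gbar for k in rest):
                return True
    return False
```

When the function returns `False` for a vertex, Theorem 5 says its curve can be built entirely from adjacent interchanges.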

Theorems 4 and 5 tend to be complementary; where one fails, the
other often works. Fig. 8(b) is an example where Theorem 5 fails (every
vertex is an active member of several cycles) and Theorem 4 works.







Fig. 11 — An adjacent interchange. 




An example of the opposite kind is shown in Fig. 12. In this example 
Theorem 4 fails (every vertex has the forbidden configuration) but 
Theorem 5 works. The whole realization can be constructed by adjacent 
interchanges. 

A realizable example to which neither Theorem 4 nor Theorem 5 
applies is given in Appendix A.6. This is the smallest such example I 
have found (twelve vertices), but I doubt that it is really minimal. 


VI. ORDER OF CROSSINGS 


It is possible to obtain directly from the graph G information about
the order in which crossings must occur along a given curve C_i. This
information is contained in configurations I will call empty chains.

An empty chain is a subset of vertex points P_{i_1}, P_{i_2}, ..., P_{i_n} in
cyclic order such that the pairs (P_{i_1}, P_{i_2}), (P_{i_2}, P_{i_3}), ..., (P_{i_{n-1}}, P_{i_n})
are joined by edges of Ḡ and all other pairs are joined by edges of G.

An empty chain is just an empty cycle with a gap in it. Since the 
empty cycle is nonrealizable, it is not surprising to find that the realiza- 
tion of the empty chain, though not quite unique, is tightly determined. 
(See Fig. 13.) 


Theorem 6: Let P_1, ..., P_n be the vertices of an empty chain. Along
curve C_k, the first crossings with C_1, ..., C_{k-2} must occur in that order;
the first crossings with C_{k+2}, ..., C_n must occur in reverse order.


The proof is given in Appendix B. 

Every empty chain of length four or more yields ordering information.
If, for instance, P_1, P_3, P_5, P_7 is an empty chain, then C_1 must
cross C_7 before it crosses C_5 and C_7 must cross C_1 before C_3. Since most





Fig. 12 — No vertex is an active member of any cycle in Ḡ; therefore, a
realization exists.







Fig. 13 — Realization of the empty chain of order seven. 


examples of interest contain several such empty chains, this theorem 
is very generally applicable. The example of Fig. 8(b), for instance, 
contains six empty chains of order four and one of order five, which 
together give complete information about first crossings. 
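
The ordering relations that Theorem 6 attaches to a single empty chain can be written out mechanically. The following sketch is illustrative; the chain is given as a tuple of vertex labels in chain order, and the function name is hypothetical. For each curve of the chain it returns two lists: the curves whose first crossings must occur in the listed order on the near side of the chain, and likewise for the far side (already reversed).

```python
def first_crossing_orders(chain):
    """For an empty chain (p_1, ..., p_n), map each member p_k to a pair
    (forward, backward):
      forward  = [p_1, ..., p_{k-2}]          first crossings in this order
      backward = [p_n, ..., p_{k+2}]          first crossings in this order
    following the statement of Theorem 6."""
    orders = {}
    for k0, p in enumerate(chain):           # k0 is the 0-based position
        forward = list(chain[:max(k0 - 1, 0)])
        backward = list(reversed(chain[k0 + 2:]))
        orders[p] = (forward, backward)
    return orders
```

Applied to the chain (1, 3, 5, 7), this gives `([1, 3], [])` for curve 7 and `([], [7, 5])` for curve 1: curve 7 meets curve 1 before curve 3, and curve 1 meets curve 7 before curve 5.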

Searching for empty chains is tedious to do by hand, but could easily 
be done by machine. 
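
One way such a machine search might look is sketched below. It is a brute-force illustration, not an algorithm from the paper: it assumes vertices numbered 0, ..., n-1 around the circle and the complementary graph given as a list of pairs, and it tries every vertex subset and every starting point around the circle. The function name is hypothetical.

```python
from itertools import combinations


def empty_chains(n, gbar_edges, min_order=4):
    """Return all empty chains of order >= min_order as tuples of vertices
    in cyclic order. A subset qualifies, for a given starting point, if the
    complementary-graph edges among its members are exactly the links
    between consecutive chain members (the pair last-first is the gap)."""
    gbar = {frozenset(e) for e in gbar_edges}
    chains = []
    for m in range(min_order, n + 1):
        for subset in combinations(range(n), m):
            induced = {frozenset(p) for p in combinations(subset, 2)
                       if frozenset(p) in gbar}
            for r in range(m):
                seq = subset[r:] + subset[:r]   # rotate the starting point
                links = {frozenset((seq[a], seq[a + 1]))
                         for a in range(m - 1)}
                if induced == links:
                    chains.append(seq)
    return chains
```

The search is exponential in the number of vertices, which is acceptable for the small orders discussed here.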

A weakness of Theorem 6, evident in the example of Fig. 8(b), is 
that it says nothing about multiple crossings. It is clear in many ex- 
amples that multiple crossings are determined by G. A way of extracting 
this information would be very useful. 


VII. CONSTRUCTION OF SOLUTIONS — SUMMARY 


The preceding results are not strong enough to define a guaranteed 
procedure for constructing realizations of constrained graphs. They do, 
however, seem to work in most cases of small order. To apply them one 
can proceed as follows: 

(i) Look for empty cycles in Ḡ of order four or more. If any exist,
G cannot be realized (Theorem 1).

(ii) Look for edges of G (or Ḡ) that do not cross other edges of G
(or Ḡ). Such edges, if internal, permit the graph to be broken
into two independent parts (Theorem 3).

(iii) Look for vertices that are free of the configuration shown in
Fig. 10. Such vertices can be temporarily deleted, since the
corresponding curves can be drawn in after the remaining curves
have been drawn (Theorem 4).

(iv) Look for vertices that are not active members of any cycle in
Ḡ. These are typically on tree-like branches of Ḡ. The corresponding
curves can be constructed either at the beginning or
the end by means of adjacent interchanges (Theorem 5).

(v) Find all the empty chains of order four or more and write down
all the ordering relations they imply. Try to locate each crossing
on both of its curves. This cannot always be done uniquely.

In a systematic procedure one could combine (i), (iv), and (v) since these
all involve chains and cycles.

Chains and cycles in Ḡ seem to be important in this problem; they
certainly yield much information. But apparently they are not enough. 
To set up necessary and sufficient conditions for realizability, some 
other element is needed. 


VIII. LOOSE ENDS 


So far we have considered only completely constrained and completely 
unconstrained graphs, corresponding to networks where none or all 
of the conductors are connected to outside terminals. In general, of 
course, one wants the intermediate case where only some of the con- 
ductors are connected to outside terminals. This remains to be studied. 

The preceding results can be used to construct realizations for the 
pure C networks represented by the nodes of the resistive network Ry . 
(See Section II.) Alternatively, one can generalize the pure C problem
as follows to take account of resistors a priori.

The graph G can be replaced by its associated matrix A, where
a_{ij} = X (for "crossing") if conductors C_i and C_j are connected through
a capacitor (or a short circuit) and a_{ij} = 0 (for "no crossing") if C_i and
C_j are not so connected. To take account of resistors, we let a_{ij} = T
if C_i is connected to C_j through a resistor but not through a capacitor.
This will mean topologically that C_i and C_j must touch without crossing.
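
As a small illustration of this encoding, the matrix A can be assembled from two lists of conductor pairs. The sketch below is an assumption about how one might store it (a list of lists of the strings "X", "T" and "0", with "0" also on the diagonal); none of these names come from the paper.

```python
def connection_matrix(n, capacitor_pairs, resistor_pairs):
    """Build the symmetric n-by-n matrix A with entries
      "X" if the two conductors are joined through a capacitor
          (or a short circuit),
      "T" if they are joined through a resistor but not a capacitor,
      "0" otherwise."""
    a = [["0"] * n for _ in range(n)]
    for i, j in resistor_pairs:
        a[i][j] = a[j][i] = "T"
    # Capacitor entries are written last so that "X" takes precedence
    # when a pair is joined through both kinds of element.
    for i, j in capacitor_pairs:
        a[i][j] = a[j][i] = "X"
    return a
```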

T and X can be defined more precisely as follows. Consider instead
of the curves C_i the regions R_i. We can assume that the R_i are simply
connected. If a_{ij} = T, then the part of R_i's boundary that lies inside
R_j must be connected (i.e., a single piece). If a_{ij} = X then the part of
R_i's boundary inside R_j may (but need not) consist of several pieces.

A. J. Goldstein has observed that in constructing an algorithm, the
regions R_i have advantages over the curves C_i. (The ends of the curves
have an unnecessarily special character.) He suggests that an algorithm
might be constructed that would keep track of all of the pieces of the
boundaries of the R_i and take, so far as possible, only steps that are
topologically mandatory. Such an algorithm could easily take account
of both T and X connections. This idea has not been worked out in
detail and we do not know how often one would be forced to take an
arbitrary step that might be wrong.


APPENDIX A 


Examples and Answers 


A.1 If indefinite stacking of conducting and dielectric layers is per- 
mitted, then any connected G is realizable regardless of the positions 
of the outside terminals. A universal realization can be constructed as 
follows. 

Since G is connected, there is a path in G that contains every vertex 
at least once. In their order along this path let the vertices be v_1, ..., v_n.
Over a disk D, stack n layers of conductor separated by layers of di- 
electric. Associate the conductors with the vertices of G according to 
their order along the path. This is permissible since the conductors 
have nonzero capacitances only with their neighbors in the stack. These
capacitances correspond to the edges in the path. An extension of any 
conductor can be brought out of the stack radially in any direction. 
Thus, any pair of conductors required to have a nonzero capacitance 
can be brought out together and superimposed in an arbitrarily long 
radial strip. Similarly, any conductor can be brought out in the appro- 
priate direction to connect to an outside terminal. 
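
The path used in this construction can be produced by a depth-first traversal that steps back along edges already used. The sketch below is one possible way to obtain such a path, offered only as an illustration; it assumes the connected graph is given as an adjacency dictionary, and the function name is hypothetical. A vertex may appear more than once, in which case the same conductor occupies more than one layer of the stack.

```python
def covering_walk(adj, start):
    """Return a list of vertices, starting at `start`, in which consecutive
    entries are always joined by an edge and every vertex of the connected
    graph `adj` (dict: vertex -> list of neighbours) appears at least once."""
    walk = [start]
    seen = {start}
    last_new = 1  # length of the walk up to the most recent new vertex

    def visit(v):
        nonlocal last_new
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                walk.append(w)
                last_new = len(walk)
                visit(w)
                walk.append(v)  # step back so the walk stays along edges

    visit(start)
    return walk[:last_new]  # drop the trailing steps back toward the start
```

The k-th entry of the returned list names the conductor placed in the k-th layer of the stack. The recursion is adequate for the small graphs considered here.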

Although this construction shows the existence of a topological 
realization, it would hardly do as a practical layout in every case, even 
if indefinite stacking were permitted. Some of the metrical difficulties 
can be overcome by substituting an annulus for the disk D, but even 
so, this construction should be regarded as an existence proof, not as a 
practical solution. 


A.2 Answers to the Examples in Fig. 5. 


The example (a) of Fig. 5, constructed by R. L. Graham, was the 
first nonrealizable example found. It turns out to be of the type dis- 
cussed in the text. Its graph is shown in Fig. 14(a). By deleting vertices 
and combining adjacent vertices it can be reduced to the graph shown 
in Fig. 14(b), which is a Kuratowski graph with a vertex inserted into 
each edge. Therefore, the example is nonrealizable. (See discussion 
subsequent to Fig. 5.) 

Example (b) of Fig. 5 has the solution shown in Fig. 15. 


A.3 Fig. 16(a) shows a nonrealizable graph which does not contain
either of the augmented Kuratowski graphs G_1* or G_2*. The outer ring


Fig. 14 — (a) Graph for example (a) of Fig. 5. (b) Reduced graph G;*. 


(B and C vertices) simulates terminal constraints; the inner part (A 
vertices) is a constrained graph (empty cycle of order 5) that is known 
to be nonrealizable. 


Proof: (i) The graph G of Fig. 16(a) cannot be reduced to G_1* or G_2*.
The operations (i') and (ii') always reduce the number of vertices. But
G already has the same number of vertices (fifteen) as G_1* and G_2*.
(ii) G is nonrealizable. Suppose a realization exists. In this realization
let C be the union of C-curves (Fig. 16(b)). No A-curve intersects C.
Therefore all A-curves must lie in the same mesh of C. Call the interior
of this mesh R. R is (or may be) partitioned into subregions by segments
of B-curves. We will show that all intersections between pairs of A-curves
lie within the same subregion of R.

The A-curves may be indexed so that in the cycle A_1, A_2, ..., A_5, A_1
each curve intersects only its neighbors. Let I be an intersection between


Fig. 15 — Solution to example (b) of Fig. 5. Both triangles can be drawn
outside the hexagon.

Fig. 16 — Nonrealizable graph which does not contain either of the augmented
Kuratowski graphs G_1* or G_2* and a partial realization.


A_i and A_{i+1 (mod 5)} and let J be an intersection between A_j and A_{j+1 (mod 5)}.
There exist two distinct paths along A-curves joining I and J. One path
P_1 traverses segments of A_{i+1}, A_{i+2}, ..., A_j and the other path P_2
traverses A_i, A_{i-1}, ..., A_{j+1} (indices mod 5). (In case i = j, P_1
traverses A_{i+1} and P_2 traverses A_i.) The sets of A-curves represented in
the two paths are disjoint. Since each B-curve can cross only one A-curve
and cannot cross any other B-curve it is not possible for a continuous
boundary made up of B-curves to cross both P_1 and P_2. Therefore, I
and J cannot belong to different subregions.

Let R* be the subregion to which all A-intersections belong. The
boundary G of R* is made up of segments of B and C curves. (Every
B-curve is represented since every A-curve must intersect its corresponding
B-curve and could leave R* only at a point belonging to this curve.)
If b_1, b_2, ..., b_5 are any points on G belonging to B_1, B_2, ..., B_5,
respectively (indexed according to the BC cycle), then the points
b_1, ..., b_5 must lie in cyclic order around G. If not, it is possible to find
a subset of four out of cyclic order, say b_1, b_3, b_2, b_4. But b_1 is joined
to b_2 by a path lying within B_1, C_1, B_2, and b_3 is joined to b_4 by a path
lying within B_3, C_3, B_4. These paths cannot cross, yet must be outside
R*. This is not possible under the postulated ordering b_1, b_3, b_2, b_4.
All other noncyclic orderings can be similarly ruled out.

The points at which the A-curves join G must, therefore, lie in the 
order determined by the BC cycle and all intersections between A-curves 
must lie within R*. But these are the conditions of a constrained case 
known (Theorem 1) to be nonrealizable. 

A.4 Answers to Examples in Fig. 8


Example (a) of Fig. 8 has no solution (empty cycle); example (b) 
has the solution shown in Fig. 17. 





Fig. 17 — Solution to example in Fig. 8 (b). Curves C3 and Cs, cross thrice. 
Multiple crossings are unavoidable in this example. 


A.5 Counterexample to the conjecture that all constrained graphs
free of empty cycles of order four or more are realizable. Fig. 18 shows
the graphs G and Ḡ for this example and a near-realization in which
only one required crossing does not occur.

The lack of empty cycles of order four or more can be verified by 
inspection; the nonrealizability can be shown as follows. 

Consider curves 4 and 8, which do not cross. Since curve 2 crosses 
both of these, there exists a path from vertex 4 to vertex 8 traversing 
curves 4, 2, and 8. In case of multiple crossings, there may be more than 
one such path. We will assume that the path is chosen so that the seg- 
ment of curve 2 contained in it has no crossings with curves 4 and 8 
except at its endpoints. Since this path is to serve as a barrier, we will
denote it by B_2.

There exists a similar path traversing curves 4, 6, and 8. We will call
this one B_6.

Since curves 2 and 6 do not cross, the barriers B_2 and B_6 can have no
points in common except along a single segment of curve 4 and a single
segment of curve 8. Thus, the barriers must be related to each other
in one of two ways shown in Fig. 19.

Curve 1 cannot cross B_2 and curve 5 cannot cross B_6. Thus, the
barriers cannot be oriented as in case (a) of Fig. 19, for if they were 
curve 1 could not cross curve 5. By a similar argument, case (b) is 
eliminated by curves 3 and 7. Thus, neither case can occur; the example 
is nonrealizable. 

Fig. 18 — Counterexample. Ḡ contains no empty cycles of order four or more,
yet G is not realizable.





Fig. 19 — Proof of counterexample. 

A.6 Fig. 20 shows the realization for an example to which neither
Theorem 4 nor Theorem 5 applies. The ordering information supplied 
by Theorem 6 is very complete in this case. Only the order of curves 5 
and 6 along curve 2 (and the symmetric counterparts) is unspecified. 
Indeed this could not be specified since either order is feasible. The 
order 6, 5 however, requires multiple crossings. The realization without 
multiple crossings is unique. 





Fig. 20 — Realization for an example to which neither Theorem 4 nor 5 applies.


APPENDIX B


Proofs of Theorems 


Theorem 2: A sufficient condition for a constrained graph G to be realizable
is that Ḡ contain no empty cycles of order four or more and that no two
edges of Ḡ cross.


Proof: The following proof depends on Theorems 3 and 5 whose proofs
are independent.

The theorem is certainly true if G has three or fewer vertices. Suppose
it is known to be true if G has m or fewer vertices. Consider a graph G
with m + 1 vertices. If G satisfies the hypotheses of the theorem, then
either Ḡ is an empty chain (see discussion preceding Theorem 6) or else
Ḡ has an internal edge. If Ḡ is an empty chain, then by Theorem 5 G is
realizable. If Ḡ has an internal edge then this edge separates G into two
parts as defined in Theorem 3. Each of these parts has m or fewer vertices
and is free of crossing edges and empty cycles of order four or more
(by hypothesis), hence by the induction assumption is realizable. By
Theorem 3, G is realizable.


Theorem 3: If (P_1, P_k) is an edge of G (Ḡ) that crosses no other edge of G
(Ḡ), and if the subgraphs G' with vertices P_1, P_2, ..., P_k and G'' with
vertices P_k, ..., P_r, P_1 are both realizable, then G is realizable.


Proof: The method of proof is proof by picture (Fig. 21). Case (a):
(P_1, P_k) is an edge of G crossing no other edges of G. None of the curves
C_2, ..., C_{k-1} crosses any of the curves C_{k+1}, ..., C_r. Therefore,
except for C_1 and C_k the realizations of G' and G'' can be confined to
separate parts of the disk R. C_1 and C_k can participate in both parts.
(See Fig. 21(a).) Case (b): (P_1, P_k) is an edge of Ḡ crossing no other edges
of Ḡ. Every one of the curves C_2, ..., C_{k-1} crosses every one of the
curves C_{k+1}, ..., C_r. The realizations of G' and G'' can be confined to
the regions labelled with these letters in Fig. 21(b). The peripheral
terminals for these realizations can be connected to the terminals on
the periphery of the disk as shown in the figure. The connections to G'
can cross the region of G'' since this can only generate allowable crossings.
The required crossings between curves of G' and curves of G'' occur in
the center of the figure.





Fig. 21 — Proof of Theorem 3. 

Theorem 4: Let G be a constrained graph with vertices P_1, ..., P_r, P_{r+1}.
G is realizable if (i) the subgraph of G with vertices P_1, ..., P_r is realizable,
and (ii) there do not exist three vertices P_i, P_j, P_k, i < j < k < r + 1,
such that P_{r+1}P_i and P_{r+1}P_k are edges of Ḡ and P_iP_k and P_{r+1}P_j are
edges of G. (See Fig. 10.)

Proof: We suppose that a realization for the subgraph with vertices
P_1, ..., P_r is at hand. It will be convenient to think of this realization
as made up of regions R_1, ..., R_r instead of curves. To simplify the
notation later on we will designate the disk R to which the realization
is confined by the indexed name R_0. We may assume that R_i intersects
the boundary of R_0 only in the vertex P_i.

Let R* be that connected piece of R_0 which contains P_{r+1} but is
exterior to the regions that R_{r+1} may not intersect. R* is the set of points
that can be reached by R_{r+1}. We will show that the boundary of R*
contains all the vertices corresponding to regions R_{r+1} must intersect,
i.e., all vertices joined to P_{r+1} by edges of G.

The boundary of R* can be partitioned into a sequence of segments
S_1, ..., S_n where S_i belongs to the boundary of R_{k(i)} and k(i) ≠
k(i + 1). The segments S_1 and S_n adjacent to P_{r+1} belong to R_0, hence
k(1) = k(n) = 0. If k(i) = 0, 1 < i < n, then S_i is that segment of
the boundary of the disk R_0 which runs from P_{k(i-1)} to P_{k(i+1)}. (The
end cases i = 1 and i = n can be included by defining k(0) = k(n + 1) =
r + 1.)

Now suppose P_{r+1}P_j is an edge of G (i.e., R_{r+1} must intersect R_j).
We will show that P_j belongs to the boundary of R*.

If j > k(i), i = 1, ..., n, then P_j belongs to S_n, hence to the boundary
of R*. If not, let i be the first index such that j < k(i + 1). It is not
possible that k(i) = j because R_j as a region that intersects R_{r+1} is
not involved in the boundary of R*. It is also not possible that 0 <
k(i) < j for this would violate hypothesis (ii). (R_{k(i)} intersects R_{k(i+1)}
because segments of their boundaries are adjacent.) Therefore, k(i) = 0.
Hence, S_i runs from P_{k(i-1)} to P_{k(i+1)}. Since k(i - 1) < j < k(i + 1),
S_i must contain P_j. Therefore, P_j is on the boundary of R*, which
was to be proved.


Theorem 5: The intersection requirements of a curve C_j can be satisfied
entirely by a sequence of adjacent interchanges if and only if P_j is not an
active member of any cycle in Ḡ.


Proof: If: A chain is a sequence of vertices P_{i_1}, P_{i_2}, ..., P_{i_n} in cyclic
order such that (P_{i_1}, P_{i_2}), (P_{i_2}, P_{i_3}), ..., (P_{i_{n-1}}, P_{i_n}) are edges of
Ḡ. For the duration of this proof a chain must have at least three vertices.
Let the vertices be numbered in clockwise order and suppose P_1 is
not an active member of any cycle in Ḡ. Let S be the set of vertices
joined to P_1 by edges of G. We will show that by a sequence of adjacent
interchanges the members of S can be moved around the circle and
finally interchanged with P_1.
S can be divided into three subsets:
(i) The clockwise set S_c: P_k ∈ S_c if P_1 is joined to P_k by a chain
whose intermediate members have indices between 1 and k.
(ii) The counterclockwise set S_cc: P_k ∈ S_cc if P_1 is joined to P_k by a
chain whose intermediate members have indices greater than k.
(iii) The rest S_R.
S_c and S_cc must be disjoint because otherwise P_1 would be an active
member of a cycle. Let P_j be that member of S_c with highest index. P_j
can be interchanged with all vertices with higher indices. Thus, it can
be moved clockwise around the circle past P_1. With P_j out of the way,
the member of S_c with next highest index can also be moved clockwise
past P_1. The process can continue until all members of S_c have been
interchanged with P_1. Similarly, the members of S_cc can be moved
counterclockwise past P_1. The members of S_R can be moved either way.
Hence, every member of S can be interchanged with P_1, which was to
be proved.


Only if: Suppose P_j is an active member of a cycle. Then it is joined
to another member P_k by an edge of G. P_j cannot be brought adjacent
to P_k, because the order of vertices in a cycle is invariant under adjacent
interchanges. Hence, P_j cannot be interchanged with P_k.


Theorem 6: Let P_1, ..., P_n be the vertices of an empty chain. Along
curve C_k, the first crossings with C_1, ..., C_{k-2} must occur in that order;
the first crossings with C_{k+2}, ..., C_n must occur in reverse order.


Proof: It is only necessary to prove the first part of the statement
(concerning C_1, ..., C_{k-2}) since the second part follows from the first by
symmetry. The first part is trivially true if k ≤ 3. We assume then that
k > 3.

The region bounded by C_{k-1} and C_{k-3} encloses C_{k-2}. (See Fig. 22.)
Since C_k cannot cross C_{k-1} it must cross C_{k-3} before it can cross C_{k-2}.


Fig. 22 — Proof of Theorem 6. 

If k > 4, then there is a curve C_{k-4}. The region bounded by C_{k-2}
and C_{k-4} encloses C_{k-3}. Therefore, C_k must cross either C_{k-2} or C_{k-4}
before it can cross C_{k-3}. But by the previous argument it cannot cross
C_{k-2} before C_{k-3}. Therefore, it must cross C_{k-4} before C_{k-3}. Since this
argument can be iterated indefinitely, the theorem holds for arbitrary k.


Realizability Conditions for the Impedance Function of
the Lossless Tapered Transmission Line


By P. L. ZADOR 
(Manuscript received August 2, 1966) 


In the study of tapered transmission lines or acoustical horns, an
unsolved problem of great practical interest is the determination of the
taper function (inductance or capacitance per unit length as a function
of distance; it is assumed that the product of these quantities is unity)
for the structure which will possess a prescribed driving point impedance
function. For the case where the structure may be modeled by a cascade
of sections of uniform transmission line segments, physical realizability
conditions and a synthesis procedure have been given by B. K. Jinariwala.^1
For the case of continuous taper no results of a general nature are
known.

In this note, we shall give an almost complete characterization of
driving point impedances for structures possessing once continuously
differentiable taper functions. Although the proof of the realizability
theorem will not be given here, the author wants to point out that the
sufficiency is, in fact, proved by a construction of the taper function.
However, this construction is too unwieldy to be of practical use.

The mathematical formulation of the problem is as follows. 

Suppose that for all complex s, y(x,s), 0 ≤ x ≤ l, is the solution of the
Horn equation

$$ (c(x)\,y'(x,s))' = s^2 c(x)\,y(x,s) \tag{1} $$

satisfying the boundary condition*

$$ y(0,s) = 1, \qquad y'(0,s) = \frac{s}{c(0)}. $$

If by driving point impedance we mean the function

$$ Z(s) = \frac{s}{c(l)}\,\frac{y(l,s)}{y'(l,s)}, $$

then the following theorem is true.
Necessity: If c(x) is a positive real function continuously differentiable
on 0 ≤ x ≤ l then
* Unit terminating resistance at zero is assumed.
(i) Z(s) is a positive real function.
(ii) There exist entire functions of the exponential type* l, N_j(s),
D_j(s), j = 1, 2, such that

(a) $Z(s) = \dfrac{N_1(s) + N_2(s)}{D_1(s) + D_2(s)}$,
(b) NiDi — NoDe = e's, 
(c) $N_1(s) = N_1(-s)$, $D_1(-s) = D_1(s)$,
$N_2(s) = -N_2(-s)$, $D_2(s) = -D_2(-s)$.


(iii) If for real ω


I 


1 
Z (iw) 


then the function f(w) has an asymptotic expansionf at +© of the 
following kind 





f(w) = Ree "*Z(iw) or Ree” 


flo) v1 +5424 540(4) 
(The constants of course may be different). 


Sufficiency: In order that a complex function Z(s) be the driving point
impedance of the differential equation (1) for some continuously differentiable
positive taper function c(x) it is sufficient that
(i') Z(s) be positive real,
(ii') there exist complex functions N_j(s), D_j(s), j = 1, 2, of the
exponential type satisfying (2) (a), (b), (c), and
(iii') the function f(ω) defined in (3) have asymptotic expansion at
infinity

fo) wi + S4h454440(4). 


Remarks: (i) It is conjectured that the two asymptotic expansions are
not independent, that is, (i'), (ii'), and one expansion may be sufficient.

(ii) A similar result is probably valid for infinite transmission lines.


* The function h(s) is called exponential type l if e~’"M(r) remains bounded for 
all r > 0 but for any l’ < le~!’"*M(r) grows to infinity. Here M(r) = ae | h(s) |. 
s|Sr 


} This means that lim w (f(w) — 1) = a, 
w=too 


lim w! (11 —-1- *.) = b, etc. 
w=s4.00 oo? 
Substituting the words "functions of order unity"^2 for "functions of the
exponential type l" should yield the correct theorem.

(iii) As a last conjecture we offer the following. Let f(ω) be a positive
even function possessing the properties

(a) f(ω) has an asymptotic expansion at infinity as in (iii').

(b) The Fourier transform of 1/f(ω) - 1 vanishes outside the interval
(-2l, 2l).
Then there exists a unique taper function such that if Z(s) is the impedance
function of the associated differential equation (1) then

. —2iwl 1 
Re Z(iw)e Fly’ 